Rearrangement operations on unrooted phylogenetic networks

Rearrangement operations transform a phylogenetic tree into another one and hence induce a metric on the space of phylogenetic trees. Popular operations for unrooted phylogenetic trees are NNI (nearest neighbour interchange), SPR (subtree prune and regraft), and TBR (tree bisection and reconnection). Recently, these operations have been extended to unrooted phylogenetic networks, which are generalisations of phylogenetic trees that can model reticulated evolutionary relationships. Here, we study global and local properties of spaces of phylogenetic networks under these three operations. In particular, we prove connectedness and asymptotic bounds on the diameters of spaces of different classes of phylogenetic networks, including tree-based and level-k networks. We also examine the behaviour of shortest TBR-sequences between two phylogenetic networks in a class, and whether the TBR-distance changes if intermediate networks from other classes are allowed: for example, the space of phylogenetic trees is an isometric subgraph of the space of phylogenetic networks under TBR. Lastly, we show that computing the TBR-distance and the PR-distance of two phylogenetic networks is NP-hard.


Introduction
Phylogenetic trees and networks are leaf-labelled graphs that are used to visualise and study the evolutionary history of taxa like species, genes, or languages. While phylogenetic trees are used to model tree-like evolutionary histories, the more general phylogenetic networks can be used for taxa whose past includes reticulate events like hybridisation or horizontal gene transfer [SS03, HRS10, Ste16]. Such reticulate events arise in all domains of life [TN05, RW07, MMM+17, WWK+17]. In some cases, it can be useful to distinguish between rooted and unrooted phylogenetic networks. In a rooted phylogenetic network, the edges are directed from a designated root towards the leaves. Hence, it models evolution along the passing of time. An unrooted phylogenetic network, on the other hand, has undirected edges and thus represents evolutionary relatedness of the taxa. In some cases, unrooted phylogenetic networks can be thought of as rooted phylogenetic networks in which the orientation of the edges has been disregarded. Such unrooted phylogenetic networks are called proper [JJE+18, FHM18]. Here we focus on unrooted, binary, proper phylogenetic networks, where binary means that all vertices except for the leaves have degree three. The set of phylogenetic networks on the same taxa can be partitioned into tiers that contain all networks of the same size.
A rearrangement operation transforms a phylogenetic tree into another tree by making a small graph-theoretical change. An operation that works locally within the tree is the NNI (nearest neighbour interchange) operation, which changes the order of the four edges incident to an edge e. See, for example, the NNI from T 1 to T 2 in Figure 1. Two further popular rearrangement operations are the SPR (subtree prune and regraft) operation, which as the name suggests prunes (cuts) an edge and then regrafts (attaches) the resulting half edge again, and the TBR (tree bisection and reconnection) operation, which first removes an edge and then adds a new one to reconnect the resulting two smaller trees. See, for example, the SPR from T 2 to T 3 and the TBR from T 3 to T 4 in Figure 1.
The set of phylogenetic trees on a fixed set of taxa together with a rearrangement operation yields a graph where the vertices are the trees and two trees are adjacent if they can be transformed into each other with the operation. We call this a space of phylogenetic trees. This construction also induces a metric on phylogenetic trees, as the distance of two trees is then given as the distance in this space, that is, the minimum number of applications of the operation that are necessary to transform one tree into the other [SOW96]. However, computing the distance of two trees under NNI, SPR, and TBR is NP-hard [DHJ+97, HDRCB08, AS01]. Nevertheless, both the space of phylogenetic trees and a metric on them are of importance for the many inference methods for phylogenetic trees that rely on local search strategies [Gus14, SJ17].
Figure 1: The NNI from T 1 to T 2 changes the order of the four edges incident to e; the SPR from T 2 to T 3 prunes an edge and then regrafts it again; and the TBR from T 3 to T 4 first removes an edge and then reconnects the resulting two trees with a new edge. Note that every NNI is also an SPR and every SPR is also a TBR but not vice versa.
works [BLS17, FHMW18, GvIJ+17, Kla19]. For unrooted networks, Huber et al. [HLMW16] first generalised NNI to level-1 networks, which are phylogenetic networks where all cycles are vertex disjoint. This generalisation includes a horizontal move that changes the topology of the network, like an NNI on a tree, and vertical moves that add or remove a triangle to change the size of the network. Among other results, they then showed that the space of level-1 networks and its tiers are connected under NNI [HLMW16, Theorem 2]. Note that connectedness implies that the distance between any two networks in such a space is finite and that NNI thus induces a metric. This NNI operation was then extended by Huber et al. [HMW16] to work for general unrooted phylogenetic networks. Again, connectedness of the space was proven. Later, Francis et al. [FHMW18] gave lower and upper bounds on the diameter (the maximum distance) of the space of unrooted phylogenetic networks of a fixed size under NNI. They also showed that SPR and TBR can straightforwardly be generalised to phylogenetic networks, that the connectedness under NNI implies connectedness under SPR and TBR, and they gave bounds on the diameters. These bounds for SPR were made asymptotically tight by Janssen et al. [JJE+18]. Here, we improve these bounds on the diameter under TBR.
There are several generalisations of SPR on rooted phylogenetic trees to rooted phylogenetic networks for which connectedness and diameters have been obtained [BLS17, FHMW18, GvIJ+17, JJE+18, Jan18]. For example, Bordewich et al. [BLS17] introduced SNPR (subnet prune and regraft), a generalisation of SPR that includes vertical moves, which add or remove an edge. They then proved connectedness under SNPR for the space of rooted phylogenetic networks and for special classes of phylogenetic networks including tree-based networks. Roughly speaking, these are networks that have a spanning tree that is the subdivision of a phylogenetic tree on the same taxa [FS15, FHM18]. Furthermore, Bordewich et al. [BLS17] gave several bounds on the SNPR-distance of two phylogenetic networks. Further bounds and a characterisation of the SNPR-distance of a tree and a network were recently proven by Klawitter and Linz [KL19]. Here, we show that these bounds and this characterisation for the SNPR-distance of rooted phylogenetic networks have analogues for the TBR-distance of two unrooted phylogenetic networks.
In this paper, we study spaces of unrooted phylogenetic networks under NNI, PR (prune and regraft), and TBR. Here, the PR and the TBR operation are the generalisations of SPR and TBR on trees, respectively, where vertical moves add or remove an edge like the vertical moves of the SNPR operation in the rooted case. After the preliminary section, we examine the relation of NNI, PR, and TBR; in particular, how a sequence using one of these operations can be transformed into a sequence using another operation (Section 3). We then study properties of shortest paths under TBR in Section 4. This includes the translation of the results from Bordewich et al. [BLS17] and Klawitter and Linz [KL19] on the SNPR-distance of rooted phylogenetic networks to the TBR-distance of unrooted phylogenetic networks. Next, we consider the connectedness and diameters of spaces of phylogenetic networks for different classes of phylogenetic networks, including tree-based networks and level-k networks (Section 5). A subspace of phylogenetic networks (e.g., the space of tree-based networks) is an isometric subgraph of a larger space of phylogenetic networks if, roughly speaking, the distance of two networks is the same in the smaller and the larger space. In Section 6 we study such isometric relations and answer a question by Francis et al. [FHMW18] by showing that the space of phylogenetic trees is an isometric subgraph of the space of phylogenetic networks under TBR. We use this result in Section 7 to show that computing the TBR-distance is NP-hard. In the same section, we also show that computing the PR-distance is NP-hard.

Preliminaries
This section provides notation and terminology used in the remainder of the paper. In particular, we define phylogenetic networks and special classes thereof, and rearrangement operations and how they induce distances. Throughout this paper, X = {1, 2, . . ., n} denotes a finite set of taxa.
Phylogenetic networks. An unrooted, binary phylogenetic network N on a set of taxa X is an undirected multigraph such that the leaves are bijectively labelled with X and all non-leaf vertices have degree three. It is called proper if every cut-edge separates two labelled leaves [FHM18], and improper otherwise. This property implies that every edge lies on a path that connects two leaves. More importantly, a network can be rooted at any leaf if and only if it is proper [JJE+18, Lemma 4.13]. If not mentioned otherwise, we assume that a phylogenetic network is proper. Furthermore, note that our definition of a phylogenetic network permits the existence of parallel edges in N, i.e., we allow that two distinct edges join the same pair of vertices. An unrooted, binary phylogenetic tree T on X is an unrooted, binary phylogenetic network on X that is a tree.
Let uN n denote the set of all unrooted, binary proper phylogenetic networks on X and let uT n denote the set of all unrooted, binary phylogenetic trees on X, where X = {1, 2, . . ., n}. To ease reading, we refer to an unrooted, binary proper phylogenetic network (resp. unrooted, binary phylogenetic tree) on X simply as phylogenetic network or network (resp. phylogenetic tree or tree). Figure 2 shows an example of a tree T ∈ uT 6 , a network N ∈ uN 6 , and an improper network M. An edge of a network N is an external edge if it is incident to a leaf, and an internal edge otherwise. A cherry {a, b} of N is a pair of leaves a and b in N that are adjacent to the same vertex. For example, each network in Figure 2 contains the cherry {1, 5}.
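The definition of a proper binary network can be checked mechanically on small examples. The following is a minimal sketch, assuming a network is stored as a list of vertex pairs (so that parallel edges are simply repeated entries); the helper names `components` and `is_proper_binary_network` and the two example networks are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def components(edges, skip=None):
    """Vertex sets of the connected components, ignoring the edge at index `skip`."""
    adjacency = defaultdict(set)
    for index, (u, v) in enumerate(edges):
        adjacency[u]  # touch both endpoints so that isolated vertices are kept
        adjacency[v]
        if index != skip:
            adjacency[u].add(v)
            adjacency[v].add(u)
    seen, result = set(), []
    for start in list(adjacency):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            x = stack.pop()
            if x not in component:
                component.add(x)
                stack.extend(adjacency[x] - component)
        seen |= component
        result.append(component)
    return result

def is_proper_binary_network(edges, leaves):
    """Check degrees, connectedness, and that every cut-edge separates two leaves."""
    leaves = set(leaves)
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    if not leaves <= set(degree):
        return False
    if any(d != (1 if x in leaves else 3) for x, d in degree.items()):
        return False
    if len(components(edges)) != 1:
        return False
    for index in range(len(edges)):
        parts = components(edges, skip=index)
        # A cut-edge is fine only if both sides contain a labelled leaf.
        if len(parts) > 1 and any(not (part & leaves) for part in parts):
            return False
    return True

# A network on the leaves 1..4 with one triangle (u, v, a).
proper = [(1, "u"), ("u", "a"), (2, "v"), ("v", "a"), ("u", "v"),
          ("a", "b"), ("b", 3), ("b", 4)]
# A tree with a pendant blob (c, p, q) that contains no leaf.
improper = [(1, "a"), (2, "a"), ("a", "w"), ("w", "b"), ("b", 3), ("b", 4),
            ("w", "c"), ("c", "p"), ("c", "q"), ("p", "q"), ("p", "q")]
```

Here `is_proper_binary_network(proper, {1, 2, 3, 4})` holds, whereas the second network fails the test because the cut-edge {w, c} separates the blob on c, p, q from all leaves.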
Tiers. We say a network N = (V, E) has reticulation number r, where r = |E| − (|V| − 1), that is, the number of edges that have to be deleted from N to obtain a spanning tree of N. For example, the network N in Figure 2 has reticulation number three. Note that a phylogenetic tree is a phylogenetic network with reticulation number zero. Let uN n,r denote tier r of uN n , the set of networks in uN n that have reticulation number r.
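As a small illustration of this formula, the following sketch computes the reticulation number of a connected network given as an edge list. The representation and the two example graphs are assumptions made for this example only.

```python
def reticulation_number(edges):
    """Return |E| - (|V| - 1) for a connected multigraph given as an edge list."""
    vertices = {x for edge in edges for x in edge}
    return len(edges) - (len(vertices) - 1)

# A tree on the leaves 1..4 with internal vertices "a" and "b".
tree = [(1, "a"), (2, "a"), ("a", "b"), ("b", 3), ("b", 4)]
# The same tree with one extra edge {u, v} between two subdivided edges.
network = [(1, "u"), ("u", "a"), (2, "v"), ("v", "a"), ("u", "v"),
           ("a", "b"), ("b", 3), ("b", 4)]

print(reticulation_number(tree), reticulation_number(network))
```

The tree lies in tier 0 and the network in tier 1, since deleting the single edge {u, v} (and suppressing u and v) leaves a spanning tree.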
Embedding. Let G be an undirected graph. Subdividing an edge {u, v} of G consists of replacing {u, v} by a path from u to v that contains at least one edge. A subdivision G * of G is a graph that can be obtained from G by subdividing edges of G. If G has no degree two vertices, there exists a canonical embedding of vertices of G to vertices of G * and of edges of G to paths of G * . Let N ∈ uN n . We say G has an embedding into N if there exists a subdivision G * of G that is a subgraph of N such that the embedding maps each labelled vertex of G * to a labelled vertex of N with the same label.
Displaying. Let T ∈ uT n and N ∈ uN n . We say N displays T if T has an embedding into N. For example, in Figure 2 the tree T is displayed by both networks N and M. Let D(N) be the set of trees in uT n that are displayed by N. This notion can be extended to trees with fewer leaves, and to networks. For this, let M be a phylogenetic network on Y ⊆ X = {1, . . ., n}. We say N displays M if M has an embedding into N. Let P = {M 1 , . . ., M k } be a set of phylogenetic networks M i on Y i ⊆ X = {1, . . ., n}. Then let uN n (P) denote the subset of networks in uN n that display each network in P.
Tree-based networks. A phylogenetic network N ∈ uN n is a tree-based network if there is a tree T ∈ uT n that has an embedding into N as a spanning tree. In other words, there exists a subdivision T * of T that is a spanning tree of N. The tree T is then called a base tree of N. Let uT B n denote the set of tree-based networks in uN n . For T ∈ uT n , let uT B n (T) denote the set of tree-based networks in uT B n with base tree T.
Level-k networks. A blob B of a network N ∈ uN n is a nontrivial two-connected component of N. The level of B is the minimum number of edges that have to be removed from B to make it acyclic. The level of N is the maximum level of all blobs of N. If the level of N is at most k, then N is called a level-k network. Let uLV-k n denote the set of level-k networks in uN n .
r-Burl. An r-burl is a specific type of blob that we define recursively: a 1-burl is the blob consisting of a pair of parallel edges; for r > 1, an r-burl is the blob obtained by placing a pair of parallel edges on one of the parallel edges of an (r − 1)-burl. See for example the network M in Figure 3.
r-Handcuffed trees and caterpillars. Let T ∈ uT n and let a and b be two leaves of T. Let e and f be the edges incident to a and b, respectively. Subdivide e and f with vertices {u 1 , . . ., u r } and {v 1 , . . ., v r }, respectively, and add the edges {u 1 , v 1 }, . . ., {u r , v r }. The resulting network is an r-handcuffed tree N ∈ uN n with base tree T on the handcuffed leaves {a, b}. Note that N has reticulation number r. If the tree T is a caterpillar and a and b form a cherry of T, then the resulting network N is an r-handcuffed caterpillar. Furthermore, we call an r-handcuffed caterpillar sorted if it is handcuffed on the leaves 1 and 2 and the leaves from 3 to n have a non-decreasing distance to leaf 1. See Figure 3 for an example.
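The construction of an r-handcuffed tree can be sketched as follows. This is an illustrative sketch only: it assumes an edge-list representation, names the new vertices ("u", i) and ("v", i), and places u 1 and v 1 next to the leaves a and b, which is one possible reading of the definition above.

```python
def handcuffed_tree(tree_edges, a, b, r):
    """Return the edge list of an r-handcuffed tree on the leaves a and b."""
    def neighbour(leaf):
        (edge,) = [e for e in tree_edges if leaf in e]  # a leaf has exactly one edge
        return edge[0] if edge[1] == leaf else edge[1]

    # Keep every edge except the two external edges at a and b.
    edges = [e for e in tree_edges if a not in e and b not in e]
    for leaf, tag in ((a, "u"), (b, "v")):
        # Subdivide the external edge with r new vertices (tag, 1), ..., (tag, r).
        path = [leaf] + [(tag, i) for i in range(1, r + 1)] + [neighbour(leaf)]
        edges += list(zip(path, path[1:]))
    # Add the r edges {u_i, v_i}.
    edges += [(("u", i), ("v", i)) for i in range(1, r + 1)]
    return edges

def reticulation_number(edges):
    vertices = {x for edge in edges for x in edge}
    return len(edges) - (len(vertices) - 1)

# A caterpillar on the leaves 1..5 in which 1 and 2 form a cherry.
caterpillar = [(1, "a"), (2, "a"), ("a", "b"), (3, "b"),
               ("b", "c"), (4, "c"), (5, "c")]
network = handcuffed_tree(caterpillar, 1, 2, 3)
```

Each of the r added edges raises |E| − (|V| − 1) by one, so `reticulation_number(network)` equals r = 3 here, in line with the note above.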
Figure 3: A network M with a 3-burl and a sorted 3-handcuffed caterpillar N .
Suboperations. To define rearrangement operations on phylogenetic networks, we first define several suboperations. Let G be an undirected graph. A degree-two vertex v of G with adjacent vertices u and w gets suppressed by deleting v and its incident edges, and adding the edge {u, w}. The reverse of this suppression is the subdivision of {u, w} with vertex v.
Let N ∈ uN n be a network, and {u, v} an edge of N. Then {u, v} gets removed by deleting {u, v} from N and suppressing any resulting degree-two vertices. We say {u, v} gets pruned at u by transforming it into the half edge {•, v} and suppressing u if it becomes a degree-two vertex. Note that otherwise u is a leaf. In reverse, we say that a half edge {•, v} gets regrafted to an edge {x, y} by transforming it into the edge {u, v} where u is a new vertex subdividing {x, y}.
Note that a TBR 0 can also be seen as the operation that prunes the edge e = {u, v} at both u and v and then regrafts both ends. Hence, we say that a TBR 0 moves the edge e. Furthermore, we say that a TBR + adds the edge e and that a TBR − removes the edge e. These operations are illustrated in Figure 4. Note that a TBR 0 has an inverse TBR 0 and that a TBR + has an inverse TBR − , and that furthermore a TBR + increases the reticulation number by one and a TBR − decreases it by one. Since a TBR operation has to yield a phylogenetic network, there are some restrictions on the edges that can be moved or removed. Firstly, if removing an edge by a TBR 0 yields a disconnected graph, then in order to obtain a phylogenetic network an edge has to be added between the two connected components. Similarly, a TBR − cannot remove a cut-edge. Secondly, the suppression of a vertex when removing an edge may not yield a loop {u, u}. Thirdly, removing or moving an edge cannot create a cut-edge that does not separate two leaves. Otherwise the network would not be proper.
Figure 4: The network N 2 can be obtained from N 1 by a TBR 0 that moves the edge {u, v} and the network N 3 can be obtained from N 2 by a TBR + that adds the edge {u′, v′}. Each operation has its corresponding TBR 0 and TBR − operation, respectively, that reverses the rearrangement.
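The removal suboperation, which is what a TBR − applies to an edge, can be sketched on an edge list as follows. This is a simplified illustration under stated assumptions: the function name `remove_edge` is hypothetical, the edge must be given in the orientation in which it is stored, and none of the three restrictions above (cut-edges, loops, properness) is checked.

```python
def remove_edge(edges, edge):
    """Delete one copy of `edge`, then suppress resulting degree-two vertices."""
    edges = list(edges)
    edges.remove(edge)
    for x in edge:
        incident = [e for e in edges if x in e]
        if len(incident) == 2:
            # Replace the two edges at x by one edge joining its neighbours.
            others = [e[0] if e[1] == x else e[1] for e in incident]
            for e in incident:
                edges.remove(e)
            edges.append((others[0], others[1]))
    return edges

tree = [(1, "a"), (2, "a"), ("a", "b"), ("b", 3), ("b", 4)]
network = [(1, "u"), ("u", "a"), (2, "v"), ("v", "a"), ("u", "v"),
           ("a", "b"), ("b", 3), ("b", 4)]

result = remove_edge(network, ("u", "v"))
same_graph = {frozenset(e) for e in result} == {frozenset(e) for e in tree}
```

In this example removing {u, v} and suppressing u and v yields exactly the tree, so `same_graph` is true and the reticulation number drops from one to zero.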
The TBR 0 operation equals the well-known TBR (tree bisection and reconnection) operation on unrooted phylogenetic trees [AS01]. The TBR operation on trees has recently been generalised to TBR 0 on improper unrooted phylogenetic networks by Francis et al. [FHMW18].
PR. A PR (prune and regraft) operation is the rearrangement operation that transforms a network N ∈ uN n into another network N′ ∈ uN n with a PR + = TBR + , a PR − = TBR − , or a PR 0 that prunes and regrafts an edge e only at one endpoint, instead of at both like a TBR 0 . Like for TBR, we then say that the PR 0/+/− moves/adds/removes the edge e in N. The PR operation is a generalisation of the well-known SPR (subtree prune and regraft) operation on unrooted phylogenetic trees [AS01]. Like for TBR, the generalisation of SPR to PR 0 for networks has been introduced by Francis et al. [FHMW18].
NNI. An NNI (nearest neighbour interchange) operation is a rearrangement operation that transforms a network N ∈ uN n into another network N′ ∈ uN n in one of the following three ways:
(NNI 0 ) Let e = {u, v} be an internal edge of N. Prune an edge f (f ≠ e) at u, and regraft it to an edge f′ (f′ ≠ e) that is incident to v.
(NNI + ) Subdivide two adjacent edges with new vertices u′ and v′, respectively, and add the edge {u′, v′}.
(NNI − ) If N contains a triangle, remove an edge of the triangle.
These operations are illustrated in Figure 5. We say that an NNI 0 moves the edge f. Alternatively, we call the edge e of an NNI 0 the axis of the operation, as the operation can also be defined as pruning f at u, and f′′ ≠ f′ at v, and regrafting f at v and f′′ at u. The NNI operation has been introduced on trees by Robinson [Rob71] and generalised to networks by Huber et al. [HLMW16, HMW16].
Figure 5: The network N 2 (resp. N 3 ) can be obtained from N 1 (resp. N 2 ) by an NNI 0 with the axis {u, v}; alternatively, N 2 can be obtained from N 1 using the NNI 0 of {1, u} to the triangle, and N 3 from N 2 by moving {1, u} to the bottom edge of the square. The labels are inherited naturally following the first interpretation of the NNI 0 moves. The network N 4 can be obtained from N 3 by an NNI + that extends x into a triangle. Each operation has its corresponding NNI 0 and NNI − operation, respectively, that reverses the transformation.
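The vertical move NNI + can be sketched on an edge list as follows. This is an illustration under assumptions: the function name `nni_plus`, the caller-supplied names for the two new vertices, and the example tree are not from the paper.

```python
def nni_plus(edges, first, second, u_new, v_new):
    """Subdivide two adjacent edges with u_new and v_new and join the new vertices."""
    if not set(first) & set(second):
        raise ValueError("the two edges must be adjacent")
    result = list(edges)
    result.remove(first)
    result.remove(second)
    for (a, b), new in ((first, u_new), (second, v_new)):
        result += [(a, new), (new, b)]  # the subdivided edge
    result.append((u_new, v_new))       # the added edge
    return result

tree = [(1, "a"), (2, "a"), ("a", "b"), ("b", 3), ("b", 4)]
network = nni_plus(tree, (1, "a"), (2, "a"), "u", "v")

# The shared endpoint "a" and the two new vertices now form a triangle.
triangle = {frozenset(("u", "a")), frozenset(("v", "a")), frozenset(("u", "v"))}
has_triangle = triangle <= {frozenset(e) for e in network}
```

Because the two subdivided edges share an endpoint, the added edge always closes a triangle, which is exactly the structure that an NNI − can remove again.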
Sequences and distances. Let N, N′ ∈ uN n be two networks. A TBR-sequence from N to N′ is a sequence σ = (N = N 0 , N 1 , . . ., N k = N′) of phylogenetic networks such that N i can be obtained from N i−1 by a single TBR for each i ∈ {1, 2, ..., k}. The length of σ is k. The TBR-distance d TBR (N, N′) between N and N′ is the length of a shortest TBR-sequence from N to N′, or infinite if no such sequence exists. Let C n be a class of phylogenetic networks. The TBR-distance on C n is defined like on uN n but with the restriction that every network in a shortest TBR-sequence has to be in C n . The class C n is connected under TBR if, for all pairs N, N′ ∈ C n , there exists a TBR-sequence σ from N to N′ such that each network in σ is in C n . Hence, for the TBR-distance to be a metric on C n , the class has to be connected under TBR and the TBR operation has to be reversible. We already noted above that the latter holds for TBR (and NNI and PR). For a connected class C n , the diameter is the maximum distance between two of its networks under its metric. The definitions for NNI and PR are analogous.
Let C′ n be a subclass of C n . Then C′ n is an isometric subgraph of C n under, say, TBR if for every N, N′ ∈ C′ n the TBR-distance of N and N′ in C′ n equals the TBR-distance of N and N′ in C n .
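Distances, diameters, and the isometric-subgraph condition are ordinary graph notions and can be illustrated with breadth-first search. The sketch below uses a toy graph (a cycle on six vertices) as a stand-in for a space of networks; the adjacency-dictionary representation and all function names are assumptions for this example, and enumerating an actual space of networks is not attempted.

```python
from collections import deque

def distances_from(source, adjacency, allowed=None):
    """BFS distances from `source`, optionally restricted to `allowed` vertices."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adjacency[x]:
            if (allowed is None or y in allowed) and y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def diameter(adjacency):
    """Maximum distance between two vertices; infinite if the graph is disconnected."""
    best = 0
    for source in adjacency:
        dist = distances_from(source, adjacency)
        if len(dist) < len(adjacency):
            return float("inf")
        best = max(best, max(dist.values()))
    return best

def is_isometric(sub, adjacency):
    """Check that distances inside `sub` equal distances in the whole graph."""
    sub = set(sub)
    inf = float("inf")
    for source in sub:
        inner = distances_from(source, adjacency, allowed=sub)
        outer = distances_from(source, adjacency)
        if any(inner.get(t, inf) != outer.get(t, inf) for t in sub):
            return False
    return True

# Toy space: a cycle on the vertices 0..5.
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
```

On this toy space the diameter is three. The path on {0, 1, 2, 3} is an isometric subgraph, but the path on {0, 1, 2, 3, 4} is not, because 0 and 4 are at distance four inside the path and at distance two in the cycle.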

Relations of rearrangement operations
On trees, it is well known that every NNI is also an SPR, which, in turn, is also a TBR. We observe that the same holds for the generalisations of these operations as defined above.
Observation 3.1. Let N ∈ uN n . Then, on N, every NNI is a PR and every PR is a TBR.
For the reverse direction, we first show that every TBR can be mimicked by at most two PR, like in uT n . Then we show how to substitute a PR with an NNI-sequence.
Lemma. Let N, N′ ∈ uN n such that there is a TBR that transforms N into N′. Then d PR (N, N′) ≤ 2, where a TBR 0 may be replaced by two PR 0 .
Proof. If N′ can be obtained from N by a TBR + or TBR − , then by the definition of PR + and PR − it follows that d PR (N, N′) = 1. If N′ can be obtained from N by a TBR 0 that is also a PR 0 , the statement follows. Assume therefore that N′ can be obtained from N by a TBR 0 that moves the edge e = {u, v} of N to e′ = {x, y} of N′. Let G be the graph obtained from N by removing e, or equivalently the graph obtained from N′ by removing e′. If e is a cut-edge, then so is e′, and without loss of generality u and x as well as v and y subdivide an edge in the same connected components of G. Furthermore, if u subdivides an edge of a pendant blob in G, then so does x. Otherwise N′ would not be proper. Therefore, the PR 0 that prunes e at u and regrafts it to obtain x yields a phylogenetic network N′′. The choices of u and x ensure that N′′ is connected and proper. There is then a PR 0 from N′′ to N′ that prunes {x, v} at v and regrafts it at y to obtain N′. Hence, d PR (N, N′) ≤ 2.
Lemma 3.4. Let N, N′ ∈ uN n,r such that there is a PR 0 that transforms N into N′. Let e be the edge of N pruned by this PR 0 . Then there exists an NNI 0 -sequence from N to N′ that only moves e and whose length is in O(n + r). Moreover, if neither N nor N′ contains parallel edges, then neither does any intermediate network in the NNI-sequence.
Proof. Assume that N can be transformed into N′ by pruning the edge e = {u, v} at u and regrafting it to f = {x, y}. Note that there is then a (shortest) path P = (u = v 0 , v 1 , v 2 , . . ., v k = x) from u to x in N \ {e}, since otherwise N′ would be disconnected. Without loss of generality, assume that P does not contain y. Furthermore, assume for now that P does not contain v. The idea is now to move e along P to f with NNI 0 operations. In particular, we show how to construct a sequence σ = (N = N 0 , N 1 , . . ., N k = N′) such that either N i+1 can be obtained from N i by an NNI 0 or N i+1 = N i , and such that N i contains the edge e i = {v i , v}. This process is illustrated in Figure 6. Assume we have constructed the sequence up to N i . Let g = {v i+1 , w} with w ≠ v be the edge incident to v i+1 that is not on P. Obtain N i+1 from N i by swapping e i and g with an NNI 0 on the axis {v i , v i+1 }. Note that this preserves the path P and that N i+1 may only contain a parallel edge if N or N′ contains parallel edges. As a result, we get N k = N′. It remains to show that every network in σ is proper. Assume otherwise and let N i+1 be the first improper network in σ. Then N i+1 contains a cut-edge e c that separates a blob B from all leaves. We claim that e c is part of P. Indeed, the pruning of the NNI 0 from N i to N i+1 has to create B and the regrafting cannot be to B, so it has to pass along e c (Figure 7). However, as P is a path, the moving edge cannot pass e c again, so all networks N j for j > i, including N′, are improper; a contradiction. Hence, all intermediate networks N i are proper and thus σ is an NNI 0 -sequence from N to N′.
Next, assume that P contains v i = v. Then first apply the process above to move v of {u, v} along P′ = (v = v i , v i+1 , . . ., v k ) to v k . In the resulting network, apply the process above to move u of {u, v} = {u, v k } along P′′ = (u = v 0 , v 1 , . . ., v i ) to v i . The process again avoids the creation of a network N j with parallel edges, if neither N nor N′ contains parallel edges. Furthermore, from Figure 7 we get that if σ contained an improper network, then u would be contained in the blob B. However, then {u, v} and e c would be edges from B to the rest of the network; again a contradiction. Lastly, note that the length of P is in O(n + r) since N contains only 2n + 3r − 3 edges. Hence, the length of σ is also in O(n + r).
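The number of edges of a network in uN n,r follows from a short degree count. The derivation below is a sketch that uses only the definitions of Section 2 (n leaves of degree one, all other vertices of degree three, and r = |E| − (|V| − 1)); the symbol i for the number of non-leaf vertices is introduced here for illustration.

```latex
\begin{align*}
|V| &= n + i, \\
2|E| &= n + 3i && \text{(sum of all vertex degrees)}, \\
|E| &= r + |V| - 1 = r + n + i - 1 && \text{(definition of } r\text{)}, \\
\Rightarrow\quad i &= n + 2r - 2, \\
|E| &= 2n + 3r - 3 .
\end{align*}
```

For r = 0 this gives the 2n − 3 edges of an unrooted binary tree on n leaves, and for every r the number of edges is in O(n + r).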
Lemma 3.5. Let n ≥ 3. Let N, N′ ∈ uN n such that there is a PR − that transforms N into N′. Let e be the edge of N removed by this PR − . Let N have reticulation number r. Then, there is an NNI 0 -sequence followed by one NNI − that transforms N into N′ by only moving and removing e and whose length is in O(n + r). Moreover, if neither N nor N′ contains parallel edges, then neither do the intermediate networks in the NNI-sequence.
Proof. Assume the PR − removes e = {u, v} from N to obtain N′. If e is part of a triangle, the PR − move is an NNI − move. If e is a parallel edge, then move either u or v with an NNI 0 to obtain a network with a triangle that contains e. Then the previous case applies. So assume otherwise, namely that e is not part of a triangle or a pair of parallel edges. Then move u with an NNI 0 -sequence closer to v to form a triangle as follows.
Because removing e in N yields the proper network N′, it follows that N \ {e} contains a shortest path P from u to v. Since e is not part of a triangle, this path must contain at least two vertices other than u and v. Let {x, y} and {y, v} be the last two edges on P. Consider the PR 0 that prunes {u, v} at u and regrafts it to {x, y}. Note that this creates a triangle on the vertices y, u and v. By Lemma 3.4 we can replace this PR 0 with an NNI 0 -sequence. Lastly, we can remove {u, v} with an NNI − to obtain N′. The bound on the length of the NNI-sequence as well as the second statement follow from Lemma 3.4.
To conclude this section, we note that all previous results combined show that we can replace a TBR-sequence with a PR-sequence, which we can further replace with an NNI-sequence. For several connectedness results in Section 5 this allows us to focus on TBR and then derive results for NNI and PR.

Shortest paths
In this section, we focus on bounds on the distance between two specified networks. We restrict to the TBR-distance in uN n and in uN n,r , and study the structure of shortest sequences of moves. We make several observations about these sequences in general, and some about shortest sequences between two networks that have certain structure in common, e.g., common displayed networks. Hence, we get bounds on the TBR-distance between two networks, and we uncover properties of the spaces of phylogenetic networks which allow for reductions of the search space. For example, if N and N′ have reticulation number r, no shortest path from N to N′ contains a network with a reticulation number less than r. The proof of this statement relies on the following observation about the order in which TBR 0 and TBR + operations can occur in a shortest path.
Observation 4.1. Let N, N′ ∈ uN n,r such that there exists a TBR-sequence σ 0 = (N, M, N′) that uses a TBR + and a TBR − . Then there is a TBR 0 that transforms N into N′.
Rephrasing Observation 4.1, a TBR + followed by a TBR − , or vice versa, can be replaced by a TBR 0 . This case can thus not occur in a shortest TBR-sequence. Next, we look at a TBR 0 followed by a TBR + .
Lemma 4.2. Let N, N′ ∈ uN n with reticulation number r and r + 1, respectively, such that there exists a shortest TBR-sequence σ 0 = (N, M, N′) that starts with a TBR 0 . Then there is a TBR-sequence σ + = (N, M′, N′) that starts with a TBR + .
Proof. Note that the TBR 0 from N to M of σ 0 can be replaced with a sequence consisting of a TBR + followed by a TBR − . This TBR − and the TBR + from M to N′ can now be combined to a TBR 0 , which gives us a sequence σ + .
Let N, N′ ∈ uN n,r and consider a shortest TBR-sequence from N to N′ that contains TBR + and TBR − operations. If the reverse statement of Lemma 4.2 also held, then we could shuffle the sequence such that consecutive TBR + and TBR − can be replaced with a TBR 0 . This would imply that uN n,r is an isometric subgraph of uN n under TBR. However, we now show that the reverse statement of Lemma 4.2 does not hold in general, and, hence, adjacent operations of different types in a shortest TBR-sequence cannot always be swapped.
Lemma 4.3. Let n ≥ 4 and r ≥ 2. Let N, N′ ∈ uN n with reticulation number r and r + 1, respectively, such that there exists a shortest TBR-sequence σ + = (N, M′, N′) that starts with a TBR + . Then it is not guaranteed that there is a TBR-sequence σ 0 = (N, M, N′) that starts with a TBR 0 .
Proof. We claim that the networks N and N′ in Figure 8 are a pair of networks for which no TBR-sequence σ 0 = (N, M, N′) exists that starts with a TBR 0 . The two networks M 1 and M 2 in Figure 8 are the only two TBR − neighbours of N′. However, it is easy to check that the TBR 0 -distance of N and M i , i ∈ {1, 2}, is at least two. Hence, a shortest TBR-sequence from N to N′ that starts with a TBR 0 has length three and so σ 0 cannot exist. Note that we can add an edge to each of the pair of parallel edges to obtain an example without parallel edges. Moreover, the example can be extended to higher n and r by adding extra leaves between leaf 3 and 4, and replacing a pair of parallel edges by a chain of parallel edges in each network.
Figure 8: There is no shortest TBR-sequence from N to N′ starting with a TBR 0 , since the networks M 1 and M 2 , which are the TBR − neighbours of N′, have TBR 0 -distance at least two to N.
Note that the TBR 0 used in Figure 8 to prove Lemma 4.3 is a PR 0 . Hence, the statement of Lemma 4.3 also holds for PR. On the positive side, if one of the two networks is a tree, then we can swap the TBR + with the TBR 0 .
Lemma 4.4. Let T ∈ uT n and N ∈ uN n with reticulation number one such that there exists a shortest TBR-sequence σ + = (T, N′, N) that starts with a TBR + . Then there is a TBR-sequence σ 0 = (T, T′, N) that starts with a TBR 0 .
Proof. We show how to obtain σ 0 from σ + . Suppose that N′ is obtained from T by adding the edge f and that N is obtained from N′ by removing e′ and adding e. Note that f is an edge of the cycle C in N′. Furthermore, e′ and f are distinct. Indeed, otherwise there would be a shorter TBR-sequence from T to N that simply adds e to T.
Assume for now that e′ is an edge of C in N′. Then, e′ can be removed with a TBR − from N′ to obtain a tree T′. Hence, the TBR + from T to N′ and the TBR − from N′ to T′ can be merged into a TBR 0 from T to T′. Furthermore, the edge e can then be added to T′ with a TBR + to obtain N. This yields the sequence σ 0 .
Next, assume that e′ is not an edge of C in N′. Then, e′ is a cut-edge in N′ and e is a cut-edge in N. Let ē′ be the edge of T that equals e′, if it exists, or the edge that gets subdivided by f into e′ and another edge. Let f′ be the edge of N defined as follows: it is equal to f itself if f is not touched by the TBR 0 move from N′ to N; it is the extension of f if one of its endpoints is suppressed by this move; or it is one of the two edges obtained by subdividing f. Now let T′ be a tree obtained by removing f′ from N. Then, there is a TBR 0 from T to T′ that moves ē′ to ē and furthermore a TBR + that adds f′ to T′ and yields N. We obtain again σ 0 . An example is given in Figure 9.
Figure 9: There is a TBR-sequence from T to N that first adds f with a TBR + and then moves e′ to e with a TBR 0 . From this, a TBR-sequence can be derived that moves ē′ to ē with a TBR 0 and then adds f′ with a TBR + .
Next, we look at shortest paths between a tree and a network. First, we show that if a network displays a tree, then there is a simple TBR − -sequence from the network to the tree. Recall that D(N) is the set of trees in uT n displayed by N ∈ uN n . This result is the unrooted analogue of Lemma 7.4 by Bordewich et al. [BLS17] on rooted phylogenetic networks.
Lemma 4.5. Let N ∈ uN n,r and T ∈ uT n . Then T ∈ D(N) if and only if d TBR (T, N) = r, that is, if and only if there exists a TBR − -sequence of length r from N to T.
Proof. Note that d TBR (T, N) ≥ r, since a TBR can reduce the reticulation number by at most one. Furthermore, if we apply a sequence of r TBR − moves on N, we arrive at a tree that is displayed by N. Hence, if T ∉ D(N), then d TBR (T, N) > r.
We now use induction on r to show that d TBR (T, N ) ≤ r if T ∈ D(N ).If r = 0, then T = N and the inequality holds.Now suppose that r > 0 and that the statement holds whenever a network with a reticulation number less than r displays T .Fix an embedding of T into N and colour all edges of N not covered by this embedding green.Note that removing a green edge with a TBR − might result in an improper network or a loop.Therefore, we have to show that there is always at least one edge that can be removed such that the resulting graph is a phylogenetic network.For this, consider the subgraph H of N induced by the green edges.If H contains a component consisting of a single green edge e, then removing e from N with a TBR − yields a network N .If H contains a tree component S, then it is easy to see that removing an external edge of S from N with a TBR − yields a network N .Otherwise, as N is proper, a component S displays a tree T S whose external edges cover exactly the external edges of S. We can then apply the same case distinction to the edges of S not covered by T S and either directly find an edge to remove or find further trees that cover the smaller remaining components.Since S is finite, we eventually find an edge to remove.The induction hypothesis then applies to N .This concludes the proof.
Note that the proof of Lemma 4.5 also works if T is a network displayed by N. Hence, we get the following corollary.
Corollary 4.6. Let N ∈ uN_{n,r} and let N′ ∈ uN_{n,r′} such that N′ is displayed by N. Then d_TBR(N, N′) = r − r′, that is, there exists a TBR^−-sequence of length r − r′ from N to N′.
Lemma 4.5 and Corollary 4.6 now allow us to construct TBR-sequences between networks that go down tiers and then come up again. In fact, for rooted networks this can sometimes be necessary, as Klawitter and Linz have shown [KL19, Lemma 13]. However, we now show that this is never necessary for TBR on unrooted networks.
Lemma 4.7. Let N, N′ ∈ uN_n. Then in no shortest TBR-sequence from N to N′ does a TBR^− precede a TBR^+.
Proof. Consider a minimal counterexample with N, N′ ∈ uN_n such that there exists a shortest TBR-sequence σ from N to N′ that uses exactly one TBR^− and one TBR^+ and that starts with this TBR^−. If σ uses TBR^0 operations between the TBR^− and the TBR^+, then, by Lemma 4.2, we can swap the TBR^+ forward until it directly follows the TBR^−. However, then we can obtain a TBR-sequence shorter than σ by combining the TBR^− and the TBR^+ into a TBR^0 by Observation 4.1; a contradiction.
Combining Lemmas 4.2 and 4.5 and Corollary 4.6, we easily derive the following two corollaries about short sequences that do not go down tiers before going back up again.
Both Corollaries 4.8 and 4.9 can easily be proven by first finding a sequence that goes down to tier 0 and back up to tier r, and then combining the r TBR^− with r TBR^+ into r TBR^0 using Lemma 4.2.
The following lemma is the unrooted analogue of Proposition 7.7 by Bordewich et al. [BLS17]. We closely follow their proof.
Proof. The proof is by induction on k. If k = 0, then the statement trivially holds. Suppose that k = 1. If T ∈ D(N′), then set T′ = T, and we have d_TBR(T, T′) = 0 ≤ 1. So assume otherwise, namely that T ∉ D(N′). Note that if N′ has been obtained from N by a TBR^+, then N′ displays T. Therefore, distinguish whether N′ has been obtained from N by a TBR^0 or a TBR^−.
Suppose that N′ has been obtained from N by a TBR^0 that moves the edge e = {u, v} of N. Fix an embedding S of T into N. Since N′ does not display T, the edge e is covered by S. Let ē be the edge of T that gets mapped to the path of S that covers e. Let S_1 and S_2 be the subgraphs of S \ {e}. Note that S_1 and S_2 have embeddings into N and N′. Now, if in N there exists a path P from the embedding of S_1 to the embedding of S_2 that avoids e, then the graph consisting of P, S_1, and S_2 is a tree T′ displayed by N′. Otherwise e is a cut-edge of N and the TBR^0 moves e to an edge e′ connecting the two components of N \ {e}. Then in N′ there is a path P from the embedding of S_1 to the embedding of S_2 in N′. Together they form an embedding of a tree T′ displayed by N′. In both cases T′ can also be obtained from T by moving ē to where P attaches to S_1 and S_2. If N′ is obtained from N by a TBR^−, then the first case has to apply.
Now suppose that k ≥ 2 and that the hypothesis holds for any two networks with TBR-distance at most k − 1.
By setting one of the two networks in the previous lemma to be a phylogenetic tree and noting that the roles of N and N′ are interchangeable, the next two corollaries are immediate consequences of Lemmas 4.5 and 4.10.
The following theorem is the unrooted analogue of Theorem 7 by Klawitter and Linz [KL19], and their proof can be applied straightforwardly by swapping SNPR and rooted networks with TBR and unrooted networks, and by using Lemmas 4.5 and 4.10 and Theorem 6.1.

Connectedness and diameters
Whereas in the previous section we studied the distance between two given networks, here we focus on global connectivity properties of several classes of phylogenetic networks under NNI, PR, and TBR. These results imply that these operations induce metrics on these spaces. For each connected metric space, we can ask about its diameter. Since a class of phylogenetic networks that contains networks with unbounded reticulation number naturally has an unbounded diameter, this question is mainly of interest for the tiers of a class. First, we recall some known results from unrooted phylogenetic trees.

Network space
Huber et al. [HMW16, Theorem 5] proved that the space of phylogenetic networks that includes improper networks is connected under NNI. We reprove this for our definition of uN_n, but first look at the tiers of this space.
Theorem 5.2. Let n ≥ 0, r ≥ 0, and m = n + r. Then uN_{n,r} is connected under NNI with the diameter in Θ(m log m).
Proof. Let N ∈ uN_{n,r} and let T ∈ uT_n be a tree displayed by N. We show that N can be transformed into a sorted r-handcuffed caterpillar N* with O(m log m) NNI. Our process is as follows and is illustrated in Figure 10.
Step 1. Transform N into a network N_T that is tree-based on T.
Step 2. Transform N_T into a handcuffed tree N_H on the leaves 1 and 2.
Step 3. Transform N_H into a sorted handcuffed caterpillar N*.
We now describe this process in detail. For Step 1, we show how to construct an NNI^0-sequence σ from N to N_T, and we give a bound on the length of σ. Let S be an embedding of T into N, that is, S is a subdivision of T and a subgraph of N. Colour all edges of N used by S black and all other edges green. Note that this yields green, connected subgraphs G_1, ..., G_l of N; more precisely, the G_i are the connected components of the graph induced by the green edges of N. Note that each G_i has at least two vertices in S, since otherwise N would not be proper. Furthermore, if each G_i consists of a single edge, then N is tree-based on T. Assuming otherwise, we show how to break the G_i apart.
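The colouring used in Step 1 can be made concrete with a small sketch. It assumes a simple graph (no parallel edges) given as an edge list together with the edges used by the embedding S; the function names and the two example networks are hypothetical and only illustrate the definition of the green components G_i.

```python
def green_components(network_edges, embedding_edges):
    """Group the green edges (those not used by the embedding) into connected components."""
    black = {frozenset(e) for e in embedding_edges}
    green = [tuple(e) for e in network_edges if frozenset(e) not in black]

    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in green:
        parent[find(a)] = find(b)

    components = {}
    for a, b in green:
        components.setdefault(find(a), []).append((a, b))
    return list(components.values())


def all_green_components_are_single_edges(network_edges, embedding_edges):
    """True if every green component is a single edge, i.e. the end condition of Step 1."""
    return all(len(c) == 1 for c in green_components(network_edges, embedding_edges))


# Embedding of the quartet 12|34 with one green edge {u, v}.
black_1 = [(1, "u"), ("u", "a"), (2, "a"), ("a", "b"), ("b", "v"), ("v", 3), ("b", 4)]
net_1 = black_1 + [("u", "v")]

# Embedding with a green vertex "w" attached to three subdivision vertices.
black_2 = [(1, "x"), ("x", "a"), (2, "y"), ("y", "a"), ("a", "b"),
           ("b", "z"), ("z", 3), ("b", 4)]
net_2 = black_2 + [("w", "x"), ("w", "y"), ("w", "z")]

print(all_green_components_are_single_edges(net_1, black_1))
print(all_green_components_are_single_edges(net_2, black_2))
```

In the second example the single green component has three edges, so Step 1 would still have to break it apart with NNI^0 moves.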
First, if there is a triangle on vertices v_1, u, v_2 where v_1 and v_2 are adjacent vertices in S and u is their neighbour in G_i, then change the embedding of S (and T) so that it takes the path v_1, u, v_2 instead of v_1, v_2 (see Figure 11a). Otherwise, there is an edge {v, u} where v is in S and the other vertices adjacent to u are not adjacent to v. Let {u, w_1} and {u, w_2} be the other edges incident to u. Apply an NNI^0 to move {u, w_1} to S as in Figure 11b. Note that each such NNI^0 decreases the number of vertices in green subgraphs and increases the number of vertices in S. Furthermore, the resulting network is clearly proper. Therefore, repeat these cases until all G_i consist of single edges. Let the resulting graph be N_T. Since there are at most 2(r − 1) vertices in all green subgraphs that are not in S, the number of required NNI^0 for Step 1 is at most 2(r − 1). (1)
In Step 2 we transform N_T into a handcuffed tree N_H on the leaves 1 and 2. Let M = {{u_1, v_1}, {u_2, v_2}, ..., {u_r, v_r}} be the set of green edges in N_T, that is, the edges that are not in the embedding S of T into N_T. Without loss of generality, assume that for i ∈ {1, ..., r} the distance between u_i and leaf 1 in S is at most the distance of v_i to leaf 1 in S. The idea is to sweep along the edges of S to move the u_i towards leaf 1 and then do the same for the v_i towards leaf 2.
For an edge e of T, let P_e be the path of S corresponding to e. Let e_1 be the edge of T incident to leaf 1. Impose directions on the edges of T towards leaf 1. Do the same for the edges of S accordingly. This gives a partial order ⪯ on the edges of T with e_1 as maximum. Let ≺ be a linear extension of ⪯ on the edges of T.
Let e = (x, y) be the minimum of ≺. Let P_e = (x, ..., y) be the corresponding path in S. From x to y along P_e, proceed as follows.
(i) If there is an edge (u_i, v_l) in P_e, then swap u_i and v_l with an NNI^0.
(ii) If there is an edge (u_i, u_j) in P_e, then move the u_j endpoint of the green edge incident to u_j onto the green edge incident to u_i with an NNI^0.
(iii) Otherwise, if there is an edge (u_i, y) in P_e, then move u_i beyond y.
This is illustrated in Figure 12. Informally speaking, we stack u_j onto u_i so they can move together towards e_1. Repeat this process for each edge in the order given by ≺.
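One concrete way to obtain an order in which to process the edges is sketched below. It assumes T is given as an edge list and uses a breadth-first search from leaf 1; reversing the discovery order puts every edge before the edge it points into, which is one valid linear extension of the order described above. The function name `sweep_order` and the example tree are ours, not part of the proof.

```python
from collections import deque


def sweep_order(tree_edges, root_leaf):
    """Return the tree edges directed towards root_leaf, deepest edges first."""
    adjacency = {}
    for a, b in tree_edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    seen = {root_leaf}
    directed = []  # edges (x, y) with y one step closer to root_leaf
    queue = deque([root_leaf])
    while queue:
        y = queue.popleft()
        for x in adjacency[y]:
            if x not in seen:
                seen.add(x)
                directed.append((x, y))
                queue.append(x)

    # BFS discovers an edge only after the edge it points into,
    # so the reversed list starts at a minimum and ends with e_1.
    return directed[::-1]


# Hypothetical example: quartet 12|34 with internal vertices "a" and "b".
tree = [(1, "a"), (2, "a"), ("a", "b"), ("b", 3), ("b", 4)]
order = sweep_order(tree, 1)
print(order)
```

The last edge of the returned list is the edge e_1 incident to leaf 1, for which case (iii) is ignored.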
For the last edge e_1, ignore case (iii). Next, "unpack" the stacked u_i's on e_1. We now count the number of NNI^0 needed. Firstly, each v_l is swapped at most once with a u_i. Secondly, each u_j is moved onto and off a green edge at most once. Furthermore, each vertex of S corresponding to a vertex of T is swapped at most twice. Hence, the total number of NNI^0 required is at most 3r + 2n. (2)
Repeat this process for the v_i towards leaf 2. Since the v_i do not have to be swapped with u_j, the total number of NNI^0 required for this is at most 2r + 2n. (3)
Note that the resulting network may not yet be a handcuffed tree as the order of the u_i and v_j may be different. Hence, lastly in Step 2, to obtain N_H sort the edges with the mergesort-like algorithm by Li et al. [LTZ96, Lemma 2]. They show that the required number of NNI^0 for this is at most
For Step 3, consider the path P in S from leaf 1 to 2. If P contains only one pendant subtree, then N_H is handcuffed on the cherry {1, 2}. Otherwise, use NNI^0 to reduce it to one pendant subtree. This takes at most n NNI^0. Next, transform the pendant subtree of P into a caterpillar to obtain a handcuffed caterpillar, again with at most n NNI^0. Lastly, sort the leaves with the algorithm from Li et al. [LTZ96, Lemma 2] to obtain the sorted handcuffed caterpillar N*. The required number of NNI^0 to get from N_H to N* is at most 2n + n log n. (5)
Since we can transform any network N ∈ uN_{n,r} into N*, it follows that uN_{n,r} is connected under NNI. Furthermore, adding Equations (1) to (5) up and multiplying the result by two shows that the diameter of uN_{n,r} under NNI^0 is at most
Francis et al. [FHMW18, Theorem 2] gave the lower bound Ω(m log m) on the diameter of tier r of the space that allows improper networks under NNI^0_improper (NNI^0 without the properness condition). Their proof consists of two parts: a lower bound on the total number of networks in a tier |uN_{n,r}|, and upper bounds on the number of networks that can be reached from one network for each fixed number of NNI^0_improper. The diameter of uN_{n,r} is at least the smallest number of moves for which the previously mentioned upper bound is greater than the lower bound on |uN_{n,r}|.
Our version of NNI^0 is stricter than theirs as we do not allow improper networks. Hence, the number of networks that can be reached with a fixed number of NNI^0 is at most the number of networks that can be reached with the same number of NNI^0_improper. Furthermore, their lower bound on |uN_{n,r}| is found by counting the number of Echidna networks, a class of networks only containing proper networks. Combining these two observations, we see that their lower bound for the diameter of uN_{n,r} under NNI^0_improper is also a lower bound for uN_{n,r} under NNI^0.
From Theorem 5.2 we get the following corollary.
Corollary 5.3. The space uN_n is connected under NNI with unbounded diameter.
Since, by Observation 3.1, every NNI is also a PR and TBR, the statements in Theorem 5.2 and Corollary 5.3 also hold for PR and TBR. This observation has been made before by Francis et al. [FHMW18] for tiers of the space of networks that allow improper networks.
Corollary 5.4. The spaces uN_n and uN_{n,r} are connected under the PR and TBR operation.
We now look at the diameters of uN_{n,r} under PR and TBR.
Theorem 5.5. Let n ≥ 0, r ≥ 0. Then the diameter of uN_{n,r} under PR^0 is in Θ(n + r) with the upper bound n + 2r.
Proof. The asymptotic lower bound was proven by Francis et al. [FHMW18, Proposition 4]. Concerning an upper bound, Janssen et al. [JJE+18, Theorem 4.22] showed that the distance of two improper networks M and M′ under PR is at most n + (8/3)r, of which (2/3)r PR^0 moves are used to transform M and M′ into proper networks N and N′. Hence, the PR-distance of N and N′ is at most n + 2r.
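Written out, the arithmetic behind this upper bound, as we read the proof, is simply the following subtraction of the moves that are only needed to make the networks proper.

```latex
\[
  d_{\mathrm{PR}}(N, N') \;\le\; \Bigl(n + \tfrac{8}{3}\,r\Bigr) - \tfrac{2}{3}\,r \;=\; n + 2r .
\]
```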
Theorem 5.6. Let n ≥ 0, r ≥ 0. Then the diameter of uN_{n,r} under TBR is in Θ(n + r) with the upper bound
Proof. Like for PR, the lower bound was proven by Francis et al. [FHMW18, Proposition 4]. In Corollary 4.8 we show that the TBR-distance of two networks N and N′ ∈ uN_{n,r} that display a tree T and T′ ∈ uT_n, respectively, is at most by Theorem 1.1 of Ding et al. [DGH11] it follows that

Networks displaying networks
Bordewich [Bor03, Proposition 2.9] and Mark et al. [MMS16] showed that the space of rooted phylogenetic trees that display a set of triplets (trees on three leaves) is connected under NNI. Furthermore, Bordewich et al. [BLS17] showed that the space of rooted phylogenetic networks that display a set of rooted phylogenetic trees is connected. We give a general result for unrooted phylogenetic networks that display a set of networks.
For this, we will use Lemma 4.5, which, as we recall, guarantees that if a network N ∈ uN_{n,r} displays a tree T ∈ uT_n, then there is a sequence of r TBR^− from N to T.
Then uN_n(P) is connected under NNI, PR, and TBR.
Proof. Define the network N_P ∈ uN_n(P) as follows. Let P_0 ∈ uT_n be the caterpillar where the leaves are ordered from 1 to n; that is, P_0 contains a path (v_2, v_3, ..., v_{n−1}) such that leaf i is incident to v_i, leaf 1 is incident to v_2, and leaf n is incident to v_{n−1}. Let e_i be the edge incident to leaf i in P_0. Subdivide e_i with k vertices u_i^1, ..., u_i^k. Now, for P_j ∈ P, j ∈ {1, ..., k}, identify leaf i of P_j with u_i^j of P_0 and remove its label i. Finally, in the resulting network suppress any degree two vertex. This is necessary if one or more of the P_j have fewer than n leaves. The resulting network N_P now displays all networks in P. An example is given in Figure 13.
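The caterpillar P_0 that serves as the backbone of N_P is easy to write down explicitly. The following sketch builds it as an edge list for n ≥ 3; the vertex names "v2", ..., and the function name are our own illustrative choices, and the subdivision and gluing of the P_j are not modelled.

```python
def caterpillar(n):
    """Edge list of the caterpillar on leaves 1..n with spine v_2, ..., v_{n-1} (n >= 3)."""
    if n < 3:
        raise ValueError("this sketch assumes at least three leaves")
    spine = [f"v{i}" for i in range(2, n)]
    # Spine path (v_2, v_3, ..., v_{n-1}).
    edges = [(spine[j], spine[j + 1]) for j in range(len(spine) - 1)]
    # Leaf i is incident to v_i for 2 <= i <= n - 1.
    edges += [(i, f"v{i}") for i in range(2, n)]
    # Leaf 1 is incident to v_2 and leaf n is incident to v_{n-1}.
    edges += [(1, "v2"), (n, f"v{n - 1}")]
    return edges


p0 = caterpillar(5)
print(p0)
```

An unrooted binary tree on n leaves has 2n − 3 edges, which gives a quick sanity check on the output.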
Let N ∈ uN_n(P). Construct a TBR-sequence from N to N_P by, roughly speaking, building a copy of N_P attached to N, and then removing the original parts of N. First, add P_0 to N by adding an edge e = {v_1, v_2} from the edge incident to leaf 1 to the edge incident to leaf 2 with a TBR^+. Then add another edge from e to the edge incident to leaf 3, and so on up to leaf n. Colour all newly added edges and the edges incident to the leaves blue, and all other edges red. Note that the blue edges now give an embedding of P_0 into the current network. Now, ignoring all red edges, it is straightforward to add the P_j, j ∈ {1, ..., k}, one after the other with TBR^+ such that the resulting network displays N_P. For example, one could start by adding a tree displayed by P_j and then adding any other edges. The first part works similarly to the construction of P_0 and the second part is possible by Lemma 4.5. Lastly, remove all red edges with TBR^− such that every intermediate network is proper. This is again possible by Lemma 4.5 and yields the network N_P. Note that in the first two stages the red edges (plus external edges) display P and in the last phase the non-red edges display P. Since we only used TBR^+ and TBR^− operations, the statement also holds for PR. For NNI, by Lemma 3.5 we can replace each of these operations that add or remove an edge e by NNI-sequences that only move and remove or add the edge e. Hence, the statement also holds for NNI.
For the following corollary, note that a quartet is an unrooted binary tree on four leaves and a quarnet is an unrooted binary, level-1 network on four leaves [HMSW18].
Corollary 5.8. Let X = {1, ..., n}. Let P be a set of phylogenetic trees on X, a set of quartets on X, or a set of quarnets on X. Then uN_n(P) is connected under NNI, PR, and TBR.

Tree-based networks
A related but more restrictive concept to displaying a tree is being tree-based. So, next, we consider the class of tree-based networks. We start with the tiers of uTB_n(T), which transformed into N′ by transforming T into T′ with O(n log n) NNI^0 or with O(n) PR^0 or TBR^0. With Theorem 5.9, the connectedness of uTB_{n,r} and the upper bounds on the diameter follow. The lower bound on the diameter under PR and TBR also follows from Theorem 5.1 and Theorem 5.9. Lastly, the connectedness of uTB_n follows similarly from the connectedness of uT_n and uTB_{n,r}.

Level-k networks
To conclude this section, we prove the connectedness of the space of level-k networks.
Theorem 5.11. Let n ≥ 2 and k ≥ 1. Then, the space uLV-k_n is connected under TBR and PR with unbounded diameter.
Proof. Let N ∈ uLV-k_n and T ∈ uT_n. We show that N can be transformed into the network M ∈ uLV-k_n that can be obtained from T by adding a k-burl to the edge incident to leaf 1. First, create a k-burl in N on the edge incident to leaf 1. This can be done using k PR^+. Next, using Lemma 4.5, remove all other blobs. This gives a network M′ which consists of a tree T′ with a k-burl at leaf 1. There is a PR^0-sequence from T′ to T, which is easily converted into a sequence from M′ to M. This proves the connectedness of uLV-k_n under PR and also TBR. Lastly, note that the diameter is unbounded because the number of possible reticulations in a level-k network is unbounded.
Note that an NNI^+ cannot directly create a pair of parallel edges. We may instead add a triangle with an NNI^+ and then use an NNI^0 to transform it into a pair of parallel edges. However, if we add the triangle within a level-k blob of a level-k network, then this would increase the level. Therefore, to prove connectedness of level-k networks under NNI we use the same idea as for PR but are more careful to not increase the level.
Theorem 5.12. Let n ≥ 3 and k ≥ 1. Then, the space uLV-k_n is connected under NNI with unbounded diameter.
Proof. Let N ∈ uLV-k_n and let T ∈ uT_n. Like in the proof of Theorem 5.11, we want to transform N into a network M obtained from T by adding a k-burl to the edge incident to leaf 1.
Let B be a level-k blob of N. Assume that N contains another blob B′. By Lemma 4.5 there is a PR^−-sequence that can remove B′. Use Lemma 3.5 to substitute this sequence with an NNI-sequence that reduces B′ to a level-1 blob. Note that this can be done locally within blob B′ and its incident edges. Therefore, this process does not increase the level of a network along this sequence. If B′ is now a cycle of size at least three, then we can shrink it to a triangle, if necessary, and remove it with an NNI^−. If B′ is a pair of parallel edges and one of its vertices is incident to a degree three vertex v that is not part of a level-k blob, then use an NNI^0 to increase the size of B′ into a triangle by including v or merge it with the blob containing v. Next, either remove the resulting triangle, or repeat the process above to remove the new blob. Otherwise, ignore B′ for now and continue with another blob of the current network that is neither B′ nor B. When this process terminates, we arrive at a network that has only blob B, and, potentially, pairs of parallel edges that are incident to both B and a leaf. That is the case since a pair of parallel edges incident to a degree three vertex not in B could be removed with an NNI^0 and an NNI^−.
If the edge incident to leaf 1 contains a pair of parallel edges or is incident to a degree three vertex not in B, then use k − 1 NNI^+ and NNI^0 (or k in the latter case) to create a k-burl next to leaf 1. Otherwise, if B is incident to three or more cut-edges, then one of them is not incident to leaf 1 and can be moved to the edge incident to leaf 1 with an NNI^0-sequence. If B is incident to two or fewer cut-edges, there is a vertex incident to three cut-edges (since n ≥ 3) and one of them can be moved to the edge incident to leaf 1 with an NNI^0-sequence. Then apply the first case again to create a k-burl. Finally, remove B and any remaining pair of parallel edges. This gives a network M′ which consists of a tree T′ with a k-burl at leaf 1. There is an NNI^0-sequence from T′ to T, which is easily converted into a sequence from M′ to M. Lastly, note that the diameter is unbounded because for each r ≥ 0, there is a level-k network with r reticulations.

Isometric relations between spaces
Recall that a space C_n is an isometric subgraph of uN_n under a rearrangement operation, say TBR, if the TBR-distance of two networks in C_n is the same as their TBR-distance in uN_n. In this section, we investigate this question for uT_n under TBR, and for tree-based networks and level-k networks under TBR and PR.
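The notion of an isometric subgraph can be illustrated on a small abstract graph. The sketch below is not specific to network spaces: it assumes a graph given as an adjacency dictionary and compares breadth-first-search distances inside the induced subgraph with distances in the whole graph. The names and the six-cycle example are ours.

```python
from collections import deque


def bfs_distances(adjacency, source, allowed=None):
    """Shortest-path distances from source, optionally restricted to vertices in `allowed`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adjacency[x]:
            if y not in dist and (allowed is None or y in allowed):
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def is_isometric_subgraph(adjacency, subset):
    """True if distances within the induced subgraph equal distances in the full graph."""
    subset = set(subset)
    for s in subset:
        full = bfs_distances(adjacency, s)
        inside = bfs_distances(adjacency, s, allowed=subset)
        if any(inside.get(t) != full.get(t) for t in subset):
            return False
    return True


# A cycle on six vertices 0, ..., 5.
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

print(is_isometric_subgraph(cycle, {0, 1, 2}))
print(is_isometric_subgraph(cycle, {0, 1, 2, 3, 4}))
```

In the second call the vertices 0 and 4 are at distance two in the cycle (via vertex 5) but at distance four in the induced path, which is the same kind of shortcut through an outside vertex that the counterexamples in this section exhibit.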
We start with uT_n. The proof of the following theorem closely follows the proof by Bordewich et al. [BLS17, Proposition 7.1] for their equivalent statement for SNPR on rooted phylogenetic trees and networks.
Theorem 6.1. The space uT_n is an isometric subgraph of uN_n under TBR. Moreover, every shortest TBR-sequence from T ∈ uT_n to T′ ∈ uT_n only uses TBR^0.
Proof. Let d_T and d_N be the TBR-distance in uT_n and uN_n, respectively. To prove the statement, it suffices to show that d_T(T, T′) = d_N(T, T′) for every pair T, T′ ∈ uT_n. Note that d_T(T, T′) ≥ d_N(T, T′) holds by definition. To prove the converse, let σ = (T = N_0, N_1, ..., N_k = T′) be a shortest TBR-sequence from T to T′. Consider the following colouring of the edges of each N_i, for i ∈ {0, ..., k}. Colour all edges of T = N_0 blue. For i ∈ {1, ..., k} preserve the colouring of N_{i−1} to a colouring of N_i for all edges except those affected by the TBR. In particular, an edge that gets added or moved is coloured red, an edge resulting from a vertex suppression is coloured blue if the two merged edges were blue and red otherwise, and the edges resulting from an edge subdivision are coloured like the subdivided edge.
Let F_i be the graph obtained from N_i by removing all red edges. We claim that F_i is a forest with at most i + 1 components. Since F_0 = T, the statement holds for i = 0. If N_i is obtained from N_{i−1} by a TBR^+, then F_i = F_{i−1}. If N_i is obtained from N_{i−1} by a TBR^0 or TBR^−, then at most one component gets split. Note that F_k is a so-called agreement forest for T and T′ and thus d_T(T, T′) ≤ k = d_N(T, T′) by Theorem 2.13 by Allen and Steel [AS01]. Furthermore, if σ used a TBR^+, then the forest F_k would contain at most k components. However, then d_T(T, T′) < k; a contradiction.
Francis et al. [FHMW18] gave the example in Figure 14 to show that the tiers uN_{n,r} for n ≥ 5 and r > 0 are not isometric subgraphs of uN_n under NNI. Their question of whether tier zero, uT_n, is an isometric subgraph of uN_n under NNI remains open.
Lemma 6.2. Let n ≥ 5 and r ≥ 1. Then the space uN_{n,r} is not an isometric subgraph of uN_n under NNI.
For n ≥ 4 and r ≥ 13 the space uN_{n,r} is not an isometric subgraph of uN_n under PR.
Proof. For the networks N and N′ in uN_{n,r} shown in Figure 15 there is a length three PR-sequence that traverses tier r + 1, for example, like the depicted sequence σ = (N = N_0, N_1, N_2, N_3 = N′). To prove the statement we show that every PR^0-sequence from N to N′ has length at least four. The networks N and N′ contain the highlighted (sub)blobs B_1, B_2 (resp. B′_1 and B′_2), B_3, and B_4. Observe that the edges between B_1 and B_2 and between B_3 and B_4 may only be pruned from a blob by a PR^0 if they get regrafted to the same blob again. Otherwise the resulting network is improper. Note that to derive B′_1 from B_1 an edge has to be regrafted to the "top" of B_1 and the edge to B_2 has to be pruned. By the first observation, combining these into one PR^0 cannot build the connection to B_3. The same applies for the transformation of B_2 into B′_2 and its connection to B_4. Therefore, we either need four PR^0 to derive B′_1 and B′_2 or two PR^0 plus two PR^0 to build the connections to B_3 and B_4. In conclusion, at least four PR^0 are required to transform N into N′, which concludes this proof.
By replacing a leaf with a tree, and adding more pairs of parallel edges to the edge leading to leaf 4, this example can be made to work for n ≥ 4 and r ≥ 13.
Figure 15: A length three PR-sequence from N to N′ that uses a PR^+, which adds f, a PR^0, which moves e, and a PR^−, which removes e′. A PR^0-sequence from N to N′ has length at least four.
For n ≥ 6 the space uTB_n is not an isometric subgraph of uN_n under TBR and PR.
Proof. Let N be the network in Figure 16. Let N′ be the network derived from N by swapping the labels 1 and 2. Note that d_TBR(N, N′) = d_PR(N, N′) = 2, since, from N to N′, we can move leaf 2 next to leaf 1 and then move leaf 1 to where leaf 2 was. However, then the network in the middle is not tree-based, since the blob derived from the Petersen graph has no Hamiltonian path if the two pendent edges of the blob are next to each other [FHM18]. We claim that there is no other length two TBR-sequence from N to N′. For this proof we call a blob derived from the Petersen graph a Petersen blob. First, note that the TBR-distance of N and N′ is at least two and there is thus no TBR-sequence that consists of a TBR^− and a TBR^+. Otherwise, these two operations could be merged into a single TBR^0 by Observation 4.1. Note that we can only move leaf 1 or 2 by pruning an incident edge if we do not affect the split 1 versus 2, 3 or break the tree-based property. Therefore, they either have to be swapped using edges of the Petersen blobs or the (4, 5, 6)-chain has to be reversed and leaf 3 moved to the other Petersen blob. However, it is straightforward to check that neither can be done with two TBR^0. In particular, we can look at what edge the first TBR^0 might move and then check whether a second TBR^0 can arrive at N′. If the first TBR^0 breaks a Petersen blob, the problem is that the second TBR^0 has to restore it. We then find that this does not allow us to make the initially planned changes to arrive at N′. On the other hand, if we avoid breaking the Petersen blob and reverse the (4, 5, 6)-chain, then leaf 3 is still on the wrong side; and if we move leaf 3 to the other Petersen blob, then not enough TBR^0 moves remain to reverse the chain.
Since there is no other length two TBR^0-sequence, there is also no other length two PR-sequence.
Theorem 6.5. For n ≥ 5 and large enough k, the space uLV-k_n is not an isometric subgraph of uN_n under TBR and PR.
Proof. For even k, the networks N and N′ in Figure 17 have TBR- and PR-distance two via the network M. However, note that in M the blobs of size k/2 + 1 and k/2 are merged into a blob of size k + 1. Therefore, M is not a level-k network. We claim that there is no TBR- or PR-sequence of length two that does not go through a level-(k + 1) network like M. An example for odd k can be derived from this.
Figure 17: For even k, a 0 -sequence from a level-k network N to a level-k network N′ (hidden reticulations of the blob-parts given inside, at least two leaves omitted: in B_1 and in B_3). However, the network M in the middle is a level-(k + 1) but not a level-k network.
It is easy to see that the TBR-distance of N and N′ is at least two and there is thus no TBR-sequence that consists of a TBR^− and a TBR^+. Otherwise, these two operations could be merged into a single TBR^0 by Observation 4.1. We thus have to prove that there is no length two TBR^0-sequence from N to N′ that avoids a level-(k + 1) network. Note that it requires two TBR^0 (or PR^0) to connect B_2 and B_3 into B′_2. Similarly, it requires either two prunings from the upper five-cycle of B_2 to obtain the triangle B′_3 or one pruning within that cycle. However, in the latter option this would not contribute to connecting B_2 and B_3 and hence overall at least three operations would be needed. Therefore we have to combine the two operations necessary to create B′_2 and to create B′_3, which however gives us a sequence like the one shown in Figure 17.
Note that the results of this section that show that the spaces of tree-based networks and level-k networks are not isometric subgraphs of the space of all networks also hold if we restrict these spaces to a particular tier r (for large enough r).
proper network obtained from N by suppressing all degree two nodes. The instance of the PR-distance decision problem consists of N′, T′ = T, and the reticulation number r′ of N′. As we can compute in polynomial time whether a cut-edge separates two labelled leaves, the reduction is polynomial time. Because a displayed tree uses only cut-edges that separate two labelled leaves, T is displayed by N if and only if it is displayed by N′. By Lemma 4.5, T is a displayed tree of N′ if and only if d_PR(N′, T) ≤ r′, which concludes the proof.
Unlike for the hardness proof of TBR-distance, we cannot readily adapt this proof to the PR-distance in uN_{n,r}. For this purpose, we need to learn more about the structure of PR-space.

Concluding remarks
In this paper, we investigated basic properties of spaces of unrooted phylogenetic networks and their metrics under the rearrangement operations NNI, PR, and TBR.We have proven connectedness and bounds on diameters for different classes of phylogenetic networks, including networks that display a particular set of trees, tree-based networks, and level-k networks.Although these parameters have been studied before for classes of rooted phylogenetic network [BLS17], this is the first paper that studies these properties for classes of unrooted phylogenetic networks besides the space of all networks.A summary of our results is shown in Table 1.
To see the improvements in diameter bounds, we compare our results to previously found bounds: For the space of phylogenetic trees uT n it was known that the diameter is asymptotically linearithmic and linear in the size of the trees under NNI and SPR/TBR [LTZ96,DGH11], respectively.Here, we have shown that the diameter under NNI is also asymptotically linearithmic for higher tiers of phylogenetic networks.Whether this also holds in the rooted case is still open.We have further (re)proven the asymptotic linear diameter for PR and TBR of these tiers and, in particular, improved the upper bound on the diameter under TBR to n − 3 − To uncover local structures of network spaces, we looked at properties of shortest sequences of moves between two networks.Here we found that shortest TBR-sequences between networks in the same tier never traverse lower tiers, and shortest TBR-sequences between trees also never traverse higher tiers.This implies that uT n is an isometric subgraph of uN n , and that computing the TBR-distance between two networks in uN n is NP-hard.This answers a question by Francis et al. 
[FHMW18].We have attempted to prove similar results for other subspaces and rearrangement moves.However, for higher tiers, we have not been able to prove that shortest TBR-sequences never traverse higher tiers.To answer this question we may need to utilise agreement graphs such as frequently used for phylogenetic trees [AS01, BS05] and, more recently, also for rooted phylogenetic networks [KL19,Kla19].Concerning NNI and PR we gave counterexamples to prove that higher tiers are not isometric subgraphs of uN n .The questions whether uT n is isometrically embedded in uN n under PR and NNI remains open.Answering these questions positively would also provide an answer to the question whether computing the shortest NNI-distance between two networks is NP-hard, and clues toward proving whether the PR-distance between two networks in the same tier is NP-hard.Further negative results that we have shown are that the spaces of tree-based networks and level-k are not isometric subgraphs of the space of all phylogenetic networks.Throughout this paper, we have restricted our attention to proper networks.We could also have chosen to use unrooted networks without the properness condition.This definition, which is mathematically more elegant, is used in most other papers, so it seems to be the obvious choice.However, it is not natural to have cut-edges that do not separate leaves: such networks carry no biological meaning.It is desirable that networks are rootable and thus have an evolutionary interpretation.Unrooted phylogenetic networks are rootable if they have at most one blob with one cut-edge.While using this in the definition of an unrooted phylogenetic network could therefore be sufficient, we go one step further, and ask that there is no such blob.This makes a network rootable at any leaf (i.e., with any taxon as out-group), which gives a stronger biological interpretation and usability.
The fact that our definition of unrooted phylogenetic networks is mathematically more restrictive means that any positive result we have proven is likely also true when using a less restrictive definition. That is, connectedness for those definitions follows easily by finding sequences to proper networks, as done by Janssen et al. [JJE+18]. As we may be able to find short sequences for this purpose, the diameter results will likely also still hold. This means that whatever definitions may be used in practice, with minor additional arguments, our results provide the theoretical background necessary to justify local search operations.
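The notion of an isometric subgraph used in the discussion above can be made concrete on a toy graph. The following is a minimal sketch, not part of the paper's method: the adjacency-dictionary representation and the names `bfs_distances` and `is_isometric_subgraph` are assumptions chosen for illustration, and the spaces of trees and networks are far too large to be checked this way in practice.

```python
from collections import deque


def bfs_distances(adj, source, allowed=None):
    """Shortest-path lengths from `source`; if `allowed` is given,
    only vertices in `allowed` may be visited (induced subgraph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y in dist or (allowed is not None and y not in allowed):
                continue
            dist[y] = dist[x] + 1
            queue.append(y)
    return dist


def is_isometric_subgraph(adj, subset):
    """True if distances inside the subgraph induced by `subset`
    equal the distances in the whole graph for all pairs in `subset`."""
    subset = set(subset)
    for s in subset:
        full = bfs_distances(adj, s)
        inner = bfs_distances(adj, s, allowed=subset)
        if any(inner.get(t) != full.get(t) for t in subset):
            return False
    return True


# Toy example (hypothetical): a cycle on six vertices 0, ..., 5.
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
```

In this toy graph, the path on `{0, 1, 2}` is isometric, whereas the path on `{0, 1, 2, 3, 4}` is not, since vertices 0 and 4 are at distance two in the cycle (via vertex 5) but at distance four within the path.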

Figure 1: The three rearrangement operations on unrooted phylogenetic trees: the NNI from $T_1$ to $T_2$ changes the order of the four edges incident to $e$; the SPR from $T_2$ to $T_3$ prunes the edge $e'$, and then regrafts it again; and the TBR from $T_3$ to $T_4$ first removes the edge $e''$, and then reconnects the resulting two trees with a new edge. Note that every NNI is also an SPR and every SPR is also a TBR but not vice versa.

Figure 2: An unrooted, binary phylogenetic tree $T \in u\mathcal{T}_6$ and an unrooted, binary proper phylogenetic network $N \in u\mathcal{N}_6$. The unrooted, binary phylogenetic network $M$ is improper since the cut-edge $e$ does not lie on a path that connects two leaves.
TBR. A TBR operation is the rearrangement operation that transforms a network $N \in u\mathcal{N}_n$ into another network $N' \in u\mathcal{N}_n$ in one of the following four ways:

($\mathrm{TBR}^0$) Remove an internal edge $e$ of $N$, subdivide an edge of the resulting graph with a new vertex $u$, subdivide an edge of the resulting graph with a new vertex $v$, and add the edge $\{u, v\}$; or, prune an external edge $e = \{u, v\}$ of $N$ that is incident to leaf $v$ at $u$, and regraft $\{\cdot, v\}$ to an edge of the resulting graph.
($\mathrm{TBR}^+$) Subdivide an edge of $N$ with a new vertex $u$, subdivide an edge of the resulting graph with a new vertex $v$, and add the edge $e = \{u, v\}$.
($\mathrm{TBR}^-$) Remove an edge $e$ of $N$.
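The edge-adding and edge-removing variants of the definition above can be sketched on a plain edge list. This is a minimal illustration under stated assumptions, not the paper's implementation: the edge-list representation, the function names, and the default vertex names `"u"` and `"v"` are hypothetical; suppressing degree-two vertices after a removal is assumed here so that the result is binary again; and no check is made that the result is connected or proper. The reticulation number is taken to be |E| − |V| + 1 for a connected graph.

```python
def subdivide(edges, index, new_vertex):
    """Replace edges[index] = (a, b) by (a, new_vertex) and (new_vertex, b)."""
    a, b = edges[index]
    return edges[:index] + [(a, new_vertex), (new_vertex, b)] + edges[index + 1:]


def tbr_plus(edges, first, second, u="u", v="v"):
    """Edge addition: `first` indexes an edge of the input graph, `second`
    indexes an edge of the graph obtained after the first subdivision."""
    step = subdivide(edges, first, u)
    step = subdivide(step, second, v)
    return step + [(u, v)]


def degree(edges, x):
    """Number of edge ends at x (a loop counts twice)."""
    return sum((a == x) + (b == x) for a, b in edges)


def suppress_degree_two(edges):
    """Assumption: merge the two edges at every degree-two vertex."""
    edges = list(edges)
    changed = True
    while changed:
        changed = False
        for x in {y for e in edges for y in e}:
            incident = [e for e in edges if x in e]
            if degree(edges, x) == 2 and len(incident) == 2:
                (a, b), (c, d) = incident
                p = b if a == x else a  # other endpoint of the first edge
                q = d if c == x else c  # other endpoint of the second edge
                edges.remove(incident[0])
                edges.remove(incident[1])
                edges.append((p, q))
                changed = True
                break
    return edges


def tbr_minus(edges, index):
    """Edge removal followed by suppression of degree-two vertices."""
    return suppress_degree_two(edges[:index] + edges[index + 1:])


def reticulation_number(edges):
    """|E| - |V| + 1, assuming the graph is connected."""
    return len(edges) - len({x for e in edges for x in e}) + 1


# Hypothetical example: the quartet tree 12|34 with internal vertices a and b.
tree = [("1", "a"), ("2", "a"), ("a", "b"), ("3", "b"), ("4", "b")]
```

For instance, `tbr_plus(tree, 0, 4)` subdivides the edges at leaves 1 and 3 and joins the two new vertices, and removing that new edge again with `tbr_minus` recovers the edge set of `tree`.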

Figure 4: Illustration of the TBR operation. The network $N_2$ can be obtained from $N_1$ by a $\mathrm{TBR}^0$ that moves the edge $\{u, v\}$, and the network $N_3$ can be obtained from $N_2$ by a $\mathrm{TBR}^+$ that adds the edge $\{u', v'\}$. Each operation has its corresponding $\mathrm{TBR}^0$ and $\mathrm{TBR}^-$ operation, respectively, that reverses the rearrangement.

Figure 5: Illustration of the NNI operation. The network $N_2$ (resp. $N_3$) can be obtained from $N_1$ (resp. $N_2$) by an $\mathrm{NNI}^0$ with the axis $\{u, v\}$; alternatively, $N_2$ can be obtained from $N_1$ using the $\mathrm{NNI}^0$ of $\{1, u\}$ to the triangle, and $N_3$ from $N_2$ by moving $\{1, u\}$ to the bottom edge of the square. The labels are inherited naturally following the first interpretation of the $\mathrm{NNI}^0$ moves. The network $N_4$ can be obtained from $N_3$ by an $\mathrm{NNI}^+$ that extends $x$ into a triangle. Each operation has its corresponding $\mathrm{NNI}^0$ and $\mathrm{NNI}^-$ operation, respectively, that reverses the transformation.

Figure 6: How to mimic the $\mathrm{PR}^0$ that prunes the edge $\{u, v\}$ at $u$ and regrafts it to $\{x, y\}$ with $\mathrm{NNI}^0$ operations that move $u$ of $\{u, v\}$ along the path $P = (u = v_0, v_1, v_2 = x)$ (for the proof of Lemma 3.4). Labels follow the definition of $\mathrm{NNI}^0$ along an axis.

Figure 7: How an $\mathrm{NNI}^0$ in the proof of Lemma 3.4 may result in an improper network where $e_c$ separates a blob $B$ from all leaves. The moving edge $\{v, v_i\}$ of $N_i$ becomes the moving edge $\{v, v_{i+1}\}$ of $N_{i+1}$. Labels follow the definition of $\mathrm{NNI}^0$ along an axis.

Figure 8: Two networks $N, N' \in u\mathcal{N}_n$ with TBR-distance two such that there exists a shortest TBR-sequence from $N$ to $N'$ starting with a $\mathrm{TBR}^+$ move (to $M'$). However, there is no shortest TBR-sequence starting with a $\mathrm{TBR}^0$, since the networks $M_1$ and $M_2$, which are the $\mathrm{TBR}^-$ neighbours of $N'$, have $\mathrm{TBR}^0$-distance at least two to $N$.
Figure 9: There is a TBR-sequence from $T$ to $N$ that first adds $f$ with a $\mathrm{TBR}^+$ and then moves $e$ to $e'$ with a $\mathrm{TBR}^0$. From this, a TBR-sequence can be derived that moves $\bar{e}$ to $\bar{e}'$ with a $\mathrm{TBR}^0$ and then adds $f$ with a $\mathrm{TBR}^+$.

Corollary 4.8. Let $N, N' \in u\mathcal{N}_n$ with reticulation number $r$ and $r'$, with $r \ge r'$. Then $d_{\mathrm{TBR}}(N, N') \le \min\{d_{\mathrm{TBR}}(T, T') : T \in D(N),\ T' \in D(N')\} + r$.

Corollary 4.9. Let $N, N' \in u\mathcal{N}_n$ with reticulation number $r$ and $r'$, and $r \ge r'$. Let $T \in u\mathcal{T}_n$ such that $T \in D(N), D(N')$. Then $d_{\mathrm{TBR}}(N, N') \le r$.
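Corollary 4.9 can be read as the special case of Corollary 4.8 in which both networks display a common tree. A short derivation, assuming only the two statements above and $d_{\mathrm{TBR}}(T, T) = 0$, with $T_1$ and $T_2$ as dummy names for the trees in the minimum (chosen here for illustration):

```latex
\begin{align*}
d_{\mathrm{TBR}}(N, N')
  &\le \min\{ d_{\mathrm{TBR}}(T_1, T_2) : T_1 \in D(N),\ T_2 \in D(N') \} + r
     && \text{(Corollary 4.8)} \\
  &\le d_{\mathrm{TBR}}(T, T) + r
     && \text{(take } T_1 = T_2 = T \in D(N) \cap D(N') \text{)} \\
  &= r.
\end{align*}
```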
Lemma 4.10. Let $N, N' \in u\mathcal{N}_n$ such that $d_{\mathrm{TBR}}(N, N') = k$. Let $T \in D(N)$. Then there exists a $T' \in D(N')$ such that $d_{\mathrm{TBR}}(T, T') \le k$.
Let $N'' \in u\mathcal{N}_n$ such that $d_{\mathrm{TBR}}(N, N'') = k - 1$ and $d_{\mathrm{TBR}}(N'', N') = 1$. Thus by induction there are trees $T''$ and $T'$ such that $T'' \in D(N'')$ with $d_{\mathrm{TBR}}(T, T'') \le k - 1$ and $T' \in D(N')$ with $d_{\mathrm{TBR}}(T'', T') \le 1$. It follows that $d_{\mathrm{TBR}}(T, T') \le k$, thereby completing the proof of the lemma.
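The last step of this induction is an application of the triangle inequality for the TBR-distance on trees. As a sketch, writing $T''$ for the tree displayed by the intermediate network and $T'$ for the tree displayed by the target network (names chosen here for illustration):

```latex
\begin{align*}
d_{\mathrm{TBR}}(T, T')
  &\le d_{\mathrm{TBR}}(T, T'') + d_{\mathrm{TBR}}(T'', T') \\
  &\le (k - 1) + 1 = k.
\end{align*}
```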

Figure 10: The process used in the proof of Theorem 5.2. We transform a network $N$ into a tree-based network $N_T$, then into a handcuffed tree $N_H$, and finally into a sorted handcuffed caterpillar $N^*$.

Figure 11: Transformation and $\mathrm{NNI}^0$ used in Step 1 to obtain a tree-based network $N_T$.

Figure 12: $\mathrm{NNI}^0$ used in Step 2 to obtain a handcuffed tree $N_H$. The label of the moving endpoint follows this endpoint to its regrafting point.

Figure 13: The canonical network $N_P \in u\mathcal{N}_5$ that displays the set of phylogenetic networks $P = (P_1, P_2)$ with the underlying caterpillar $P_0$.

Figure 14: An NNI-sequence from $N$ to $N'$ using an $\mathrm{NNI}^+$ that adds $f$, an $\mathrm{NNI}^0$ that moves $e$, and an $\mathrm{NNI}^-$ that removes $e'$. A shortest $\mathrm{NNI}^0$-sequence from $N$ to $N'$ has length three.

Figure 16: A tree-based network on the left and a Hamiltonian path through a blob derived from the Petersen graph on the right.

Table 1: Connectedness and diameters, if bounded, for the various classes and rearrangement operations. Here $m = n + r$, $P$ is a set of phylogenetic networks, and $T \in u\mathcal{T}_n$.