Homework 8: TSP

0. Background


A. Goals

The purpose of the Travelling Salesman Problem assignment is to practice implementing linked lists. The specific goals are to:

B. Background

A travelling salesman needs to visit each of n cities exactly once, and arrive back home, keeping the total distance travelled as short as possible. In this assignment, you will write a program to find a path connecting n points that passes through each point exactly once.

The travelling salesman problem is a notoriously difficult combinatorial optimization problem. There does not exist an efficient algorithm to find the optimal tour, the tour of smallest distance. The only way to find the optimal tour is to use brute force: to compute the distance of all possible tours. The problem with this is that there are n! (n factorial) possible tours; enumerating them all and computing their distance would be very slow.

However, there are efficient ways to find a tour that is near-optimal; these methods are called heuristics. You will implement two heuristics to find good (but not optimal) solutions to the traveling salesman problem.

1000 points in the plane optimal tour
1,000 points optimal tour through the same 1,000 points

The travelling salesman problem has a wealth of applications such as vehicle routing, circuit board drilling, circuit board design, robot control, X-ray crystallography, machine scheduling, and computational biology.

C. Your Program

In this assignment, you will write a Tour class that models a tour by a linked list of Points.

You will implement the following methods to insert points into a tour.

D. Designing the Requirements and Interface

Download data.zip and decompress it into your folder for this homework assignment. This zip file contains data files you can use for testing as well as some helper classes and interfaces you will need:

Review the class material and textbook chapters on linked lists.

1. Tour and Point Classes


A. The Point Class

data.zip contains the Point class that represents a point in a tour. Open it in DrJava and study it carefully. The API is as follows:

public class Point
------------------------------------------------------------------------------------------------------
       Point(double x, double y)    // create the Point (x, y)
String toString()                   // return String representation
  void draw()                       // draw Point using PennDraw
  void drawTo(Point that)           // draw line segment between this Point and that
double distanceTo(Point that)       // return Euclidean distance between this Point and that

B. The Tour Class

Create a skeleton for your Tour class, which must implement TourInterface:

public interface TourInterface
-----------------------------------------------------------------------------------------------------------
String toString()                 // create a String representation of the Tour
  void draw(Point p)              // draw the Tour using PennDraw
                                  // any edge starting or ending at p should be in a distinct color
   int size()                     // number of Points in this Tour
double distance()                 // return the total distance of the Tour
  void insertInOrder(Point p)     // insert p at the end of the Tour
  void insertNearest(Point p)     // insert p using the nearest neighbor heuristic
  void insertSmallest(Point p)    // insert p using the smallest increase heuristic

Write method stubs for each method declaration in the TourInterface interface. The stubs must each return a dummy value if necessary so that Tour.java compiles.

Add appropriate header comments and method comments.

C. Linked List Node Class

Your Tour class must use a linked list using a private inner class Node. Each Node holds a reference to a Point.

You can create an inner class by declaring a class inside another. For instance, one can write

class OuterClass {
    // ...
    class InnerClass {
        // ...
    }
}

Having an inner classes is useful when the class is only used in a certain context, leading to more encapsulation and greater code maintainability.

Just like fields, inner classes can be declared as public or private. private inner classes can only be used within the surrounding class. Inner classes are able to access the fields of its surrounding class. However, because inner class instances must be associated with an instance of the outer class, inner classes cannot declare static members, and they cannot be instantiated independently of the instance of the outer class.

Declare a private field head in your Tour class to hold the first Node in the Tour.

Your linked list should contain two Nodes that contain the first Point of the Tour – one of these Nodes at the head, and the other Node at the end. Both of these two Nodes must refer to the same Point object (as opposed to them referring to distinct Point object instances with the same field values). This is a required implementation detail.

You may not use Java's built-in LinkedList class to implement your linked list.

D. Constructor and toString()

Implement a single constructor for your Tour class that takes no arguments and creates an empty Tour.

toString() returns a String representation of the Tour (the first Point should show up at the end as well, just like it does in your linked list structure). Call toString() on each Point to get a String representation of the Point. Add a line break character, '\n', after each Point. Your output must match this description exactly in order to pass our grading scripts.

If the Tour is empty (has no Points), toString() should return the empty String.

Required Testing: Add a main that creates an empty tour and prints it out. Your program should now simply print a blank line. Once this works, submit and make sure you also pass the empty tour submission test.

E. insertInOrder()

To facilitate testing, you will need to implement insertInOrder() so you can add Points to your tour.

insertInOrder(Point p) adds the Point p as the last Point of the Tour.

Remember that your Tour class should maintain one Node at the end of the linked list referring to the first Point.

If the Tour is empty, make sure that after this method finishes, your linked list contains two Node objects, both referring to the same Point.

If you wish, you may write a helper function that adds a given Point after a given Node. (Helper functions should be declared private.)

Required Testing: Add code to main to create the following four points and add them to your tour using insertInOrder():

The following image shows the structure of the link list these insertions should create:

four-point tour visualization

Print out the tour using System.out.println You should see the following output (including a blank line at the end):

(0.0, 0.0)
(1.0, 0.0)
(1.0, 1.0)
(0.0, 1.0)
(0.0, 0.0)

Once this test passes, submit your code and make sure it passes our toString() tests for insertInOrder() as well. (At this point, your code will still fail the tests for size() and distance().)

F. Utility Methods

Implement the size(), distance(), and draw() methods of Tour. There are many good ways to implement these methods, using for loops, while loops, or recursion. The choice is up to you.

size() returns the number of Points in the Tour, (without including the first Point twice).

distance() returns the total length of the Tour from Point to Point. Use the distanceTo(Point p) method of a Point to find its distance to p. An empty Tour has a distance of 0.0.

draw(Point p) draws the Tour from Point to Point using PennDraw. Both edges adjacent to Point p are drawn in a different color (if p is null, none of the edges should be in a different color). Use the drawTo(Point q) method of a Point to draw a line from it to q.

The image below shows what our reference draws when we call tour.draw(a). tour is the four-point tour you created for testing.

draw square tour

Required Testing: Add code to main to test each of these methods on an empty tour, a tour containing only one point, a tour containing two points, and a tour containing four points. We encourage you to include additional tests as well. When you are certain all of your own tests work, submit and make sure our tests pass as well.

As you debug your code, you may find this Java execution visualizer helpful. (It was created by daveagp.)

2. Tour Insertion Heuristics

Implement the insertNearest() and insertSmallest() methods.

A. Testing with VisualizeTour

The client VisualizeTour program included in data.zip provides a user interface for you to test the methods you have written in Tour. Run it with a filename argument to animate the construction of your Tour. In the table at the bottom of this page, we have listed the values of size() and distance() that your methods should obtain for each insert method, as well as the PennDraw output that draw() should give.

Required Testing: Check that your in-order insertion method works for at least the input files tsp0.txt, tsp1.txt, tsp2.txt, tsp3.txt, tsp4.txt, tsp5.txt, tsp8.txt, tsp10.txt, and tsp100.txt. Both the drawing itself, and the size and distance, need to match. Do not continue until insertInOrder works for all these cases!

B. insertNearest()

insertNearest(Point p) adds the Point p to the Tour after the closest Point already in the Tour.

If there are multiple closest Points with equal distances to p, insert p after the first such Point in the linked list.

Your method must behave as insertInOrder() does when the linked list is empty.

If you wrote a helper function in the previous section that inserts a given Point after a given Node, you may find it useful again here.

Required Testing: Make sure your VisualizeTour results match the figures below for the Nearest-Neighbor Heuristic for all test cases through tsp100.txt. Both the drawing itself, and the size and distance, need to match. Then submit and make sure it passes our submission tests as well.

C. insertSmallest()

insertSmallest(Point p) adds the Point p to the Tour in the position where it would cause the smallest increase in the Tour's distance.

Do not compute the entire Tour distance for each position of p. Instead, compute the incremental distance: the change in distance from adding p between Points s and t is the sum of the distances from s to p and from p to t, less the original distance from s to t.

If there are multiple positions for p that cause the same minimal increase in distance, insert p in the first such position.

Your method must behave as insertInOrder() does when the linked list is empty.

If you wrote a helper function when writing insertInOrder() that inserts a given Point after a given Node, you may find it useful again here.

Comment out all print statements in Tour before running VisualizeTour on a file of more than 100 Points. Otherwise, you will be waiting for a long time for VisualizeTour to finish.

Required Testing: Make sure your VisualizeTour results match the figures below for all test cases through tsp100.txt. Both the drawing itself, and the size and distance, need to match. Then submit and make sure it passes our submission tests as well.

D. Reference Output

Test your nearest-neighbor heuristic and smallest-increase heuristic methods using VisualizeTour. The following are the values and PennDraw output that your Tour methods should give for each of the provided input files.

FileIn-Order Insertion ('o') Nearest-Neighbor Heuristic ('n')Smallest-Increase Heuristic ('s')
tsp0.txt Size: 0
Distance: 0.0000
tsp0.txt order insertion
Size: 0
Distance: 0.0000
tsp0.txt nearest insertion
Size: 0
Distance: 0.0000
tsp0.txt smallest insertion
tsp1.txt Size: 1
Distance: 0.0000
tsp1.txt order insertion
Size: 1
Distance: 0.0000
tsp1.txt nearest insertion
Size: 1
Distance: 0.0000
tsp1.txt smallest insertion
tsp2.txt Size: 2
Distance: 632.46
tsp2.txt order insertion
Size: 2
Distance: 632.46
tsp2.txt nearest insertion
Size: 2
Distance: 632.46
tsp2.txt smallest insertion
tsp3.txt Size: 3
Distance: 832.46
tsp3.txt order insertion
Size: 3
Distance: 832.46
tsp3.txt nearest insertion
Size: 3
Distance: 832.46
tsp3.txt smallest insertion
tsp4.txt Size: 4
Distance: 963.44
tsp4.txt order insertion
Size: 4
Distance: 956.06
tsp4.txt nearest insertion
Size: 4
Distance: 839.83
tsp4.txt smallest insertion
tsp5.txt Size: 5
Distance: 2595.1
tsp5.txt order insertion
Size: 5
Distance: 2595.1
tsp5.txt nearest insertion
Size: 5
Distance: 1872.8
tsp5.txt smallest insertion
tsp8.txt Size: 8
Distance: 3898.9
tsp8.txt order insertion
Size: 8
Distance: 3378.8
tsp8.txt nearest insertion
Size: 8
Distance: 2545.6
tsp8.txt smallest insertion
tsp10.txt Size: 10
Distance: 2586.7
tsp10.txt order insertion
Size: 10
Distance: 1566.1
tsp10.txt nearest insertion
Size: 10
Distance: 1655.7
tsp10.txt smallest insertion
tsp100.txt Size: 100
Distance: 25547
tsp100.txt order insertion
Size: 100
Distance: 7389.9
tsp100.txt nearest insertion
Size: 100
Distance: 4887.2
tsp100.txt smallest insertion
tsp1000.txt Size: 1000
Distance: 3.2769e+05
tsp1000.txt order insertion
Size: 1000
Distance: 27869
tsp1000.txt nearest insertion
Size: 1000
Distance: 17266
tsp1000.txt smallest insertion
bier127.txt Size: 127
Distance: 21743
bier127.txt order insertion
Size: 127
Distance: 6494.0
bier127.txt nearest insertion
Size: 127
Distance: 4536.8
bier127.txt smallest insertion
circuit1290.txt Size: 1290
Distance: 4.3033e+05
circuit1290.txt order insertion
Size: 1290
Distance: 25030
circuit1290.txt nearest insertion
Size: 1290
Distance: 14596
circuit1290.txt smallest insertion
germany15112.txt Size: 15112
Distance: 4.2116e+06
germany15112.txt order insertion
Size: 15112
Distance: 93119
germany15112.txt nearest insertion
Size: 15112
Distance: 55754
germany15112.txt smallest insertion
mona-20k.txt Size: 20000
Distance: 4.9650e+06
mona-20k.txt order insertion
Size: 20000
Distance: 94894
mona-20k.txt nearest insertion
Size: 20000
Distance: 56334
mona-20k.txt smallest insertion
mona-50k.txt Size: 50000
Distance: 1.2366e+07
mona-50k.txt order insertion
Size: 50000
Distance: 1.6168e+05
mona-50k.txt nearest insertion
Size: 50000
Distance: 95598
mona-50k.txt smallest insertion
mona-100k.txt Size: 100001
Distance: 2.4795e+07
mona-100k.txt order insertion
Size: 100001
Distance: 2.6272e+05
mona-100k.txt nearest insertion
Size: 100001
Distance: 1.5472e+05
mona-100k.txt smallest insertion
usa13509.txt Size: 13509
Distance: 3.9108e+06
usa13509.txt order insertion
Size: 13509
Distance: 77450
usa13509.txt nearest insertion
Size: 13509
Distance: 45075
usa13509.txt smallest insertion

3. Extra Credit

A. TourEC

For extra credit, implement a better heuristic in a class TourEC that implements the TourECInterface interface. You are not required to use the Tour or Point classes for your extra credit solution. If you use a modified version of these classes to implement TourEC, include them in your extra.zip; otherwise, your TA may be unable to compile your code.

Be warned that this is a relatively difficult extra credit, although it gives an opportunity to learn a great deal about an extremely important problem. We will award a special prize to the student whose TourEC finds the shortest tour around the 1,000-point set in tsp1000.txt within about five minutes of runtime.

Here are some heuristics you may choose to implement.

Farthest insertion The farthest insertion heuristic is just like the smallest increase insertion heuristic described in the assignment, except that the Points need not be inserted in the same order as the input. Start with a Tour consisting of the two Points that are farthest apart. Repeat the following:

  • Among all Points not in the Tour, choose the one that is farthest from any Point already in the Tour.
  • Insert that Point into the Tour in the position where it causes the smallest increase in the distance.

You will have to store all of the unused Points in an appropriate data structure, until they get inserted into the Tour. If your code takes a long time, your algorithm probably performs approximately n3 steps. If you're careful and clever, this can be improved to n2 steps.

Node interchange local search Run the original greedy heuristic (or any other heuristic). Then, repeat the following:

  • Choose a pair of Points.
  • Swap the two Points in if this improves the Tour. For example if the original greedy heuristic returns 1-5-6-2-3-4-1, you might consider swapping 5 and 3 if the Tour 1-3-6-2-5-4-1 has a smaller distance.

Writing a function to swap two nodes in a linked list provides great practice with coding linked lists. Be careful, it can be a little trickier that you might first expect (e.g., make sure your code handles the case when the two Points occur consecutively in the original Tour).

Edge interchange local search Run the original greedy heuristic (or any other heuristic). Then, repeat the following:

  • Choose a pair of edges, say 1-2 and 3-4.
  • Replace them with 1-3 and 2-4 if the resulting Tour 1-3-6-2-5-4-1 has a smaller distance.

This requires some care, as you will have to reverse the orientation of the links in the original Tour between Nodes 3 and 2. After performing this heuristic, there will be no crossing edges in the Tour, although it need not be optimal.

B. Enrichment

  • The best known tour for tsp1000.txt is a solution of distance 15476.519, which was found using the Concorde TSP solver.
  • Georgia Tech's TSP site contains a wealth of interesting information including many applications of the TSP and two TSP games.
  • Here's a 13,509 city problem that contains each of the 13,509 cities in the continental US with population over 500. The optimal solution was discovered in 1998 by Applegate, Bixby, Chvatal and Cook using theoretical ideas from linear and integer programming. The following 15,112 city problem was solved to optimality in April, 2001, and is the current world record. It took 585,936,700 CPU seconds (along with a ton of cleverness) to find the optimal tour through 15,112 cities in Germany.
  • Some folks even use the TSP to create and sell art. Check out Bob Bosch's page. You can even make your own TSP artwork.
  • Here is a New York Times article on finding an optimal traveling politician tour in the state of Iowa.
  • Here's a survey article on heuristics for the TSP.

4. Readme and Submission


A. README

Complete readme_tsp.txt in the same way you have done for previous assignments.

B. Submission

Submit Tour.java and readme_tsp.txt on the course website. You should also submit TourEC.java if you did the extra credit.

Before submission remove any print statements that were used for debugging or testing your functions. Any testing done in main() can be kept.

Be sure that every method has an appropriate header comment, and that your code is well-documented.