Algorithms I & II, Princeton University, P8 - Java Language

这是 Algorithms I & II, Princeton 系列笔记的 P8 部分。零基础接触 Java,就从这门课开始吧!这篇笔记将会记录课程中覆盖到的关于 Java 的各类语言知识。

Java 的语法和 C++ 还是挺像的,想要入门并不困难。因此这里将主要记录一些不一样的东西......!


  This article is a self-administered course note.

  It will NOT cover any exam or assignment related content.


Generics

Suppose we've implemented a data structure StackOfStrings, what could we do if the clients also want StackOfURLs, StackOfInts, StackOfVans...?

  • Attempt 1. Implement a seperate stack class for each type. (\(\times\))

  • Attempt 2. Implement a stack with items of type Object. See codes below:

    1
    2
    3
    4
    5
    6
    StackOfObjects s = new StackOfObjects();
    Apple a = new Apple();
    Orange b = new Orange();
    s.push(a);
    s.push(b);
    a = (Apple) (s.pop()); // error-prone: this will cause run-time error!

    The type Object could be seen as the base class in C++. Apple and Orange are derived classes. However, this requires casting in client, which may cause run-time error if types mismatch.

  • Attempt 3. Applying Java generics. See codes below:

    1
    2
    3
    4
    5
    6
    Stack<Apple> s = new Stack<Apple>();
    Apple a = new Apple();
    Orange b = new Orange();
    s.push(a);
    s.push(b); // compile-time error happens!
    a = s.pop();

    Using Java generics avoid casting in client and could discover type dismatch errors at compile-time instead of run-time.

Guiding principles. Welcome compile-time errors; avoid run-time errors.

Note that Java does NOT support generic array. To achieve this, proper casting is needed:

1
2
3
4
5
private Item[] s;
...
s = new Item[capacity]; // Wrong! Java doesn't support generic array
s = (Item[]) new Object[capacity]; // Acceptable. the "ugly" cast
// Even with casting, Java will warn you for "unchecked cast"

Q. What to do about primitive types?

To create a generic stack with primitive types like int, we use the corresponding wrapper types. Note that Java supports autoboxing which allows automatic cast between a primitive type and its wrapper.

1
2
3
Stack<Integer> s = new Stack<Integer>();
s.push(17); // s.push(Integer.valueOf(17));
int a = s.pop(); // int a = s.pop().IntValue();


学到这里,我发现 Java 的泛型 (generics) 与 C++ 中的类模板 (templates) 很相似,就连定义的语法都同样采用 diamond operator <>。上网搜索发现,Java 的泛型 (即 attempt 3) 本质上是上文中 attempt 2 的语法糖。这也是为什么 Java 的泛型又被称作类型参数化 (type parameterization)

具体来说,在 Java 中,两个泛型栈实例 Stack<Apple>Stack<Orange> 将共享 Stack 类的静态变量;而在 C++ 中则不会,这两个模板栈属于完全不同的类。


Iterator Interface

By implementing the java.lang.Iterable interface, we could support iteration over data structure items by clients, without revealing the internal representation of the data structure.

There are two interfaces that we need to implement in our data structure:

  • Iterable : Has a method that returns an Iterator.

    1
    2
    3
    public interface Iterable<Item> {
    Iterator<Item> iterator();
    }
  • Iterator: Has methods hasNext() and next().

    1
    2
    3
    4
    5
    public interface Iterator<Item> {
    boolean hasNext();
    Item next();
    // optional: void remove();
    }

If the above two interfaces are properly implemented, Java supports elegant client code ("foreach" statement) for interating items in your data structures.

1
2
3
4
5
6
7
8
9
// foreach statement code (shorthand):
for (String s : stack) StdOut.println(s);

// equivalent code (longhand):
Iterator<String> i = stack.iterator();
while (i.hasNext()) {
String s = i.next();
StdOut.println(s);
}

Here is a sample code on how to implement a reverse iterator for an array-based stack.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
import java.util.Iterator;

public class Stack<Item> implements Iterable<Item> {
...
public Iterator<Item> iterator() {
return new ReverseArrayIterator();
}
private class ReverseArrayIterator implements Iterator<Item> {
private int i = N;

public boolean hasNext() { return i > 0; }
public Item next() { return s[--i]; }
}
}


Comparable Interface

Implementing the Comparable interface allows us to sort any type of data.

For a simple sort client:

1
2
3
4
5
6
7
8
9
10
public class Student {
...
public static void main(String[] args) {
int n = Integer.parseInt(args[0]);
Student[] stu = new Student[n];
Insertion.sort(stu);
for (int i = 0; i < n; ++i)
StdOut.println(stu[i].getName());
}
}

We need to implement the Comparable interface in class Student as a callback.

1
2
3
public interface Comparable<Item> {
public int compareTo(Item that);
}

In particular, the object implementation requires to implement the compareTo method.

Note that the compareTo method should follow the total order.

1
2
3
4
5
6
7
8
9
10
public class Student implements Comparable<Student> {
...
public int compareTo(Student that) {
if (this.grades < that.grades)
return -1;
if (this.grades > that.grades)
return +1;
return 0;
}
}

A total order is a binary relation \(\leq\) that satisfies:

  • Antisymmetry: if \(v\leq w\) and \(w\leq v\), then \(v=w\).
  • Transitivity: if \(v\leq w\) and \(w\leq x\), then \(v\leq x\).
  • Totality: either \(v\leq w\) or \(w\leq v\) or both.

Finally, we implement the sorting algorithm. The key point is that there should be NO dependence on Student data type.

For convenience, two useful sorting abstractions less and exch are introduced.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
public class Insertion {
public static void sort(Comparable[] a) {
int n = a.length;
for (int i = 0; i < N; ++i)
for (int j = i; j > 0; --j) {
if (a[j].compareTo(a[j - 1]) < 0)
exch(a, j, j - 1);
else
break;
}
}

private static boolean less(Comparable v, Comparable w) {
return v.compareTo(w) < 0;
}

private static void exch(Comparable[] a, int i, int j) {
Comparable swap = a[i];
a[i] = a[j];
a[j] = swap;
}

private static boolean isSorted(Comparable[] a) {
for (int i = 1; i < a.length; ++i)
if (less(a[i], a[i - 1])) return false;
return true;
}

}


Assertion Statement

Assertion is the statement to test assumptions about the program.

  • Helps detect logic bugs.
  • Documents code.

Java assert statement throws an exception unless boolean condition is true.

1
2
assert isSorted(a, lo, hi);
// if a[lo]...a[hi] is not sorted, throws an exception

Java assert statement could be enabled or disabled at runtime. \(\Rightarrow\) No cost in production code.

1
2
java -ea MyProgram    // enable assertions
java -da MyProgram // disable assertions (default)

Note that assertions are better used to check internal invariants; do NOT use for external argument checking.


Comparator Interface

We have introduced Comparable interface in the previous section; It is usually implemented with a type's natural order.

However, when we want to sort using one or more alternate orders (must be a total order), we need to implement the Comparator interface.

natural order and alternate orders of letter-string

See the following Java system sort clients:

1
2
3
4
5
6
7
String[] a;
...
Arrays.sort(a); // uses natural order defined in Comparable interface
...
Arrays.sort(a, String.CASE_INSENCITIVE_ORDER);
Arrays.sort(a, Collator.getInstance(new Locale("es")));
Arrays.sort(a, new BritshPhoneBookOrder());

Note that the second argument passed into Arrays.sort() is a Comparator object.

To create such objects, we need to implement Comparator interface in the corresponding class:

  • Define a (nested) class that implements the Comparator interface.
  • Implement the compare() method.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
public class Student {
public static final Comparator<Student> BY_NAME = new ByName();
public static final Comparator<Student> BY_SECTION = new BySection();
private final String name;
private final int section;
...

private static class ByName implements Comparator<Student> {
public int compare(Student v, Student w) {
return v.name.compareTo(w.name);
}
}

private static class BySection implements Comparator<Student> {
public int compare(Student v, Student w) {
return v.section - w.section;
}
}
}

If we want to support comparators in our own sort implementations:

  • Use Object instead of Comparable.
  • Pass Comparator to sort() and less() and use it in less().

Here is a sample code of insertion sort using a Comparator.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
public static void sort(Object[] a, Comparator comparator) {
int N = a.length;
for (int i = 0; i < N; ++i)
for (int j = i; j > 0 && less(comparator, a[j], a[j - 1]); --j)
exch(a, j, j - 1);
}

private static boolean less(Comparator c, Object v, Object w) {
return c.compare(v, w) < 0; // compare() is used here in less()
}

private static void exch(Object[] a, int i, int j) { // no need to pass comparator in
Object swap = a[i]; a[i] = a[j]; a[j] = swap;
}


Immutability

Classes should be immutable unless there's a very good reason to make them mutable.

—— Joshua Bloch (Java architect)

If we make a data type immutable, it means we can't change the data type value once created.

To achieve immutability in Java, we need to:

  • final in class definition: make sure the class can't override instance methods.
  • final in instance variables: make all instance variables private and final.
  • make defensive copy of mutable instance variables in constructor.
  • make sure all instance methods don't change instance variables.
1
2
3
4
5
6
7
8
9
10
11
12
public final class Vector {
private final int N;
private final double[] data;

public Vector(double[] data) {
this.N = data.length;
for (int i = 0; i < N; ++i)
this.data[i] = data[i]; // make defensive copy
}

... // instance methods don't change instance variables
}

Advantages of immutability.

  • Simplifies debugging.
  • Safer in presence of hostile code.
  • Simplifies concurrent programming.


Equality Test

All Java classes inherit a method equals().

For a user-defined type, it is typically unsafe to use equals() with inheritance as it may violate symmetry. So we need to overwrite equals() as a callback.

Java requirements. For any references x, y and z, the below equivalence relation must hold.

  • Reflective. x.equals(x) is true.
  • Symmetric. x.equals(y) iff y.equals(x).
  • Transitive. if x.equals(y) and y.equals(z), then x.equals(z).
  • Non-null. x.equals(null) is false.

Standard recipe of equals design for user-designed types is as follows:

  • Optimization for reference equality. (==)
  • Check against null.
  • Check whether two objects are of the same type and cast.
  • Compare each significant field.
    • if field is a primitive type, use ==.
    • if field is an object, use equals(). (thus apply rule recursively)
    • if field is an array, apply to each entry. (lazy approaches: use Arrays.equals(a,b) or Arrays.deepEquals(a,b))

Best practices of equals design to improve efficiency.

  • No need to use calculated fields that depent on other fields.
  • Compare fields mostly likely to differ first.
  • Make compareTo() consistent with equals(). (x.equals(y) iff (x.compareTo(y) == 0))

See an example equals design for the class Date.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
public final class Date implements Comparable<Date> {
private final int month;
private final int day;
private final int year;

public boolean equals(Object y) { // Note that argument type should be Object
if (y == this) return true; // optimize for reference equality
if (y == null) return false; // check for null
if (y.getClass() != this.getClass()) // objects must be in the same class
return false;
Date that = (Date) y; // cast
if (this.day != that.day ) return false;
if (this.month != that.month) return false;
if (this.year != that.year ) return false;
return true;
}
}


Hash Function

All Java classes inherit a method hashCode(), which returns a 32-bit int.

  • Requirement. If x.equals(y), then (x.hashCode() == y.hashCode()).
  • Highly desirable. If !x.equals(y), then (x.hashCode() != y.hashCode()).

Horner's method. Use \(L\) multiplies/adds to hash string of length \(L\).

Horner's method usually uses gen 31 to hash string of length \(L\): \(h=s_031^{L-1}+...s_{L-2}31^1+s_{L-1}31^0\).

1
2
3
4
5
6
7
8
9
10
11
12
13
// Java library implementation
public final class String {
private int hash = 0; // cache of hash code
private final char[] s;
...
public int hashCode() {
int h = hash;
if (h != 0) return h;
for (int i = 0; i < length(); ++i) hash = s[i] + (31 * hash);
hash = h;
return h;
}
}

Standard recipe of hash code design for user-defined types is as follows:

  • Combine each significant field using the \(31x+y\) rule.
  • If field is a primitive type, use wrapper type hashCode().
  • If field is null, return 0.
  • If field is a reference type, use hashCode(). (thus applies rule recursively)
  • If field is an array, apply to each entry. (lazy approach: use Arrays.deepHashCode())

This recipe works resonably well in practice and is widely used in Java libraries.


String in Java

Java 中的字符串类型主要有 String, StringBuilder 两种,实现的不同决定了其应用场景的不同。

String 类:

  • Sequence of immutable characters.
  • Underlying implementation. Immutable char[] array, offset, and length.

StringBuilder 类:

  • Sequence of mutable characters.
  • Underlying implementation. Resizing char[] array and length.
常见 string operations 的效率对比

immutability 上的差异造成了这两种字符串实现在 concat()substring() 操作上的表现差异。

值得注意的是 substring() 函数:String 类的 immutability 允许它使用 copy of reference to original char array,因此创建一个 substring 仅仅需要 constant time。

除了以上提到的 built-in functions 之外,一些其他常见的字符串操作也会因为字符串实现的不同而产生效率上的差异。例如:

  • 字符串翻转函数 rev()String 类需要 quadratic time:每一次添加新字符都需要 linear time。而 StringBuilder 类只需要 linear time:使用 append() 函数,每一次添加新字符仅需 constant time。
  • Form array of suffixes。String 类需要 linear time & space:如上所述,每次创建一个 substring 仅需要 constant time。而 StringBuilder 类需要 quadratic time & space。


Java Language Style

这门课的 projects 设计实在是太良心了,在评分反馈里居然还包含 style check!难以想象教职员们为课程的准备倾注了多少心血……至少我从来没有见过有一门课的评分标准里还包含 coding style。

  1. The instance variable name must start with a lowercase letter and use camelCase.
  2. The class name must start with a uppercase letter and use CamelCase.
  3. Remove unused import statements.
  4. Typecast (类型转换) must be followed by a whitespace.
  5. Define constructors after static and instance variables but before methods.
  6. Define constant variables to hold the constants used more than once.
  7. If a private instance (or static) variable is initialized only in the declaration or constructor, then you should make it final.
  8. Declare static variables before instance variables, constructors, and methods.
  9. It is bad design (and a violation of the API) for the success of calling one method to depend on the client previously calling another method.


Reference

  This article is a self-administered course note.

  References in the article are from corresponding course materials if not specified.

Course info:

Algorithms, Part 1, Princeton University. Lecturer: Robert Sedgewick & Kevin Wayne.

Course textbook:

Algorithms (4th edition), Sedgewick, R. & Wayne, K., (2011), Addison-Wesley Professional, view


-----------------------------------そして、次の曲が始まるのです。-----------------------------------