Algorithms I & II, Princeton University, P8 - Java Language
这是 Algorithms I & II, Princeton 系列笔记的 P8 部分。零基础接触 Java,就从这门课开始吧!这篇笔记将会记录课程中覆盖到的关于 Java 的各类语言知识。
Java 的语法和 C++ 还是挺像的,想要入门并不困难。因此这里将主要记录一些不一样的东西......!
This article is a self-administered course note.
It will NOT cover any exam or assignment related content.
Generics
Suppose we've implemented a data structure
StackOfStrings
, what could we do if the clients also want
StackOfURLs
, StackOfInts
,
StackOfVans
...?
Attempt 1. Implement a seperate stack class for each type. (\(\times\))
Attempt 2. Implement a stack with items of type
Object
. See codes below:1
2
3
4
5
6StackOfObjects s = new StackOfObjects();
Apple a = new Apple();
Orange b = new Orange();
s.push(a);
s.push(b);
a = (Apple) (s.pop()); // error-prone: this will cause run-time error!The type
Object
could be seen as the base class in C++.Apple
andOrange
are derived classes. However, this requires casting in client, which may cause run-time error if types mismatch.Attempt 3. Applying Java generics. See codes below:
1
2
3
4
5
6Stack<Apple> s = new Stack<Apple>();
Apple a = new Apple();
Orange b = new Orange();
s.push(a);
s.push(b); // compile-time error happens!
a = s.pop();Using Java generics avoid casting in client and could discover type dismatch errors at compile-time instead of run-time.
Note that Java does NOT support generic array. To achieve this, proper casting is needed:
1 | private Item[] s; |
Q. What to do about primitive types?
To create a generic stack with primitive types like int
,
we use the corresponding wrapper types. Note that Java
supports autoboxing which allows automatic
cast between a primitive type and its wrapper.
1 | Stack<Integer> s = new Stack<Integer>(); |
学到这里,我发现 Java 的泛型 (generics) 与 C++ 中的类模板 (templates) 很相似,就连定义的语法都同样采用 diamond operator
<>
。上网搜索发现,Java 的泛型 (即 attempt 3) 本质上是上文中 attempt 2 的语法糖。这也是为什么 Java 的泛型又被称作类型参数化 (type parameterization)。具体来说,在 Java 中,两个泛型栈实例
Stack<Apple>
与Stack<Orange>
将共享Stack
类的静态变量;而在 C++ 中则不会,这两个模板栈属于完全不同的类。
Iterator Interface
By implementing the java.lang.Iterable
interface, we
could support iteration over data structure items by
clients, without revealing the internal representation of the data
structure.
There are two interfaces that we need to implement in our data structure:
Iterable
: Has a method that returns anIterator
.1
2
3public interface Iterable<Item> {
Iterator<Item> iterator();
}Iterator
: Has methodshasNext()
andnext()
.1
2
3
4
5public interface Iterator<Item> {
boolean hasNext();
Item next();
// optional: void remove();
}
If the above two interfaces are properly implemented, Java supports elegant client code ("foreach" statement) for interating items in your data structures.
1 | // foreach statement code (shorthand): |
Here is a sample code on how to implement a reverse iterator for an array-based stack.
1 | import java.util.Iterator; |
Comparable Interface
Implementing the Comparable
interface allows us to sort
any type of data.
For a simple sort client:
1 | public class Student { |
We need to implement the Comparable
interface in class
Student
as a callback.
1 | public interface Comparable<Item> { |
In particular, the object implementation requires to implement the
compareTo
method.
Note that the compareTo
method should follow the
total order.
1 | public class Student implements Comparable<Student> { |
Finally, we implement the sorting algorithm. The key point is that
there should be NO dependence on Student
data type.
For convenience, two useful sorting abstractions less
and exch
are introduced.
1 | public class Insertion { |
Assertion Statement
Assertion is the statement to test assumptions about the program.
- Helps detect logic bugs.
- Documents code.
Java assert statement throws an exception unless boolean condition is true.
1 | assert isSorted(a, lo, hi); |
Java assert statement could be enabled or disabled at runtime. \(\Rightarrow\) No cost in production code.
1 | java -ea MyProgram // enable assertions |
Note that assertions are better used to check internal invariants; do NOT use for external argument checking.
Comparator Interface
We have introduced Comparable
interface in the previous
section; It is usually implemented with a type's natural
order.
However, when we want to sort using one or more alternate
orders (must be a total order), we need to
implement the Comparator
interface.
See the following Java system sort clients:
1 | String[] a; |
Note that the second argument passed into Arrays.sort()
is a Comparator
object.
To create such objects, we need to implement Comparator
interface in the corresponding class:
- Define a (nested) class that implements the
Comparator
interface. - Implement the
compare()
method.
1 | public class Student { |
If we want to support comparators in our own sort implementations:
- Use
Object
instead ofComparable
. - Pass
Comparator
tosort()
andless()
and use it inless()
.
Here is a sample code of insertion sort using a Comparator.
1 | public static void sort(Object[] a, Comparator comparator) { |
Immutability
Classes should be immutable unless there's a very good reason to make them mutable.
—— Joshua Bloch (Java architect)
If we make a data type immutable, it means we can't change the data type value once created.
To achieve immutability in Java, we need to:
final
in class definition: make sure the class can't override instance methods.final
in instance variables: make all instance variables private and final.- make defensive copy of mutable instance variables in constructor.
- make sure all instance methods don't change instance variables.
1 | public final class Vector { |
Advantages of immutability.
- Simplifies debugging.
- Safer in presence of hostile code.
- Simplifies concurrent programming.
Equality Test
All Java classes inherit a method equals()
.
For a user-defined type, it is typically unsafe to use
equals()
with inheritance as it may violate symmetry. So we
need to overwrite equals()
as a callback.
Standard recipe of equals design for user-designed types is as follows:
- Optimization for reference equality. (
==
) - Check against
null
. - Check whether two objects are of the same type and cast.
- Compare each significant field.
- if field is a primitive type, use
==
. - if field is an object, use
equals()
. (thus apply rule recursively) - if field is an array, apply to each entry. (lazy approaches: use
Arrays.equals(a,b)
orArrays.deepEquals(a,b)
)
- if field is a primitive type, use
Best practices of equals design to improve efficiency.
- No need to use calculated fields that depent on other fields.
- Compare fields mostly likely to differ first.
- Make
compareTo()
consistent withequals()
. (x.equals(y)
iff(x.compareTo(y) == 0)
)
See an example equals design for the class Date
.
1 | public final class Date implements Comparable<Date> { |
Hash Function
All Java classes inherit a method hashCode()
, which
returns a 32-bit int
.
- Requirement. If
x.equals(y)
, then(x.hashCode() == y.hashCode())
. - Highly desirable. If
!x.equals(y)
, then(x.hashCode() != y.hashCode())
.
Horner's method usually uses gen 31 to hash string of length \(L\): \(h=s_031^{L-1}+...s_{L-2}31^1+s_{L-1}31^0\).
1 | // Java library implementation |
Standard recipe of hash code design for user-defined types is as follows:
- Combine each significant field using the \(31x+y\) rule.
- If field is a primitive type, use wrapper type
hashCode()
. - If field is null, return 0.
- If field is a reference type, use
hashCode()
. (thus applies rule recursively) - If field is an array, apply to each entry. (lazy approach: use
Arrays.deepHashCode()
)
This recipe works resonably well in practice and is widely used in Java libraries.
String in Java
Java 中的字符串类型主要有 String
,
StringBuilder
两种,实现的不同决定了其应用场景的不同。
String
类:
- Sequence of immutable characters.
- Underlying implementation. Immutable
char[]
array, offset, and length.
StringBuilder
类:
- Sequence of mutable characters.
- Underlying implementation. Resizing
char[]
array and length.
immutability 上的差异造成了这两种字符串实现在 concat()
与 substring()
操作上的表现差异。
值得注意的是 substring()
函数:String
类的
immutability 允许它使用 copy of reference to original char
array,因此创建一个 substring 仅仅需要 constant time。
除了以上提到的 built-in functions 之外,一些其他常见的字符串操作也会因为字符串实现的不同而产生效率上的差异。例如:
- 字符串翻转函数
rev()
。String
类需要 quadratic time:每一次添加新字符都需要 linear time。而StringBuilder
类只需要 linear time:使用append()
函数,每一次添加新字符仅需 constant time。 - Form array of suffixes。
String
类需要 linear time & space:如上所述,每次创建一个 substring 仅需要 constant time。而StringBuilder
类需要 quadratic time & space。
Java Language Style
这门课的 projects 设计实在是太良心了,在评分反馈里居然还包含 style check!难以想象教职员们为课程的准备倾注了多少心血……至少我从来没有见过有一门课的评分标准里还包含 coding style。
- The instance variable name must start with a lowercase letter and use camelCase.
- The class name must start with a uppercase letter and use CamelCase.
- Remove unused import statements.
- Typecast (类型转换) must be followed by a whitespace.
- Define constructors after static and instance variables but before methods.
- Define constant variables to hold the constants used more than once.
- If a private instance (or static) variable is initialized only in the declaration or constructor, then you should make it final.
- Declare static variables before instance variables, constructors, and methods.
- It is bad design (and a violation of the API) for the success of calling one method to depend on the client previously calling another method.
Reference
This article is a self-administered course note.
References in the article are from corresponding course materials if not specified.
Course info:
Algorithms, Part 1, Princeton University. Lecturer: Robert Sedgewick & Kevin Wayne.
Course textbook:
Algorithms (4th edition), Sedgewick, R. & Wayne, K., (2011), Addison-Wesley Professional, view