CSc 471 Spring 2022 - Assignment 3

Computer Science – The City College of New York
Computer Vision
Assignment 3 (Deadline: Sunday, 03/27, before midnight)

(Items marked with * are optional, for extra credit.)

Note: Turn in a PDF document containing a list of your .m files (not the code itself), the results of your experiments (images, tables, plots), and an analysis of the results. All written work must be typed soft copies and sent to Prof. Zhu via email: Zhigang Zhu <>. For the programming part, send ONLY your source code by email; please don’t send your images or executables (even if you use C++). You are responsible for the loss of your submission if you don’t write “CSC 471 Computer Vision Assignment 3” in the subject of your email. Write your name and ID (last four digits) in both your report and your code. Please don’t zip your report with your code and other files; send the report as a separate PDF file. The rest can be in a zipped file.

1. (Camera Models - 30 points) Prove that the vector from the viewpoint of a pinhole camera to the vanishing point (a point on the image plane) of a set of parallel 3D lines is parallel to the direction of that set of parallel lines. Please show the steps of your proof.

Hint: You can either use geometric reasoning or algebraic calculation. 

If you choose to use geometric reasoning, you can use the fact that the projection of a 3D line in space is the intersection of its “interpretation plane” with the image plane.  Here the interpretation plane (IP) is a plane passing through the 3D line and the center of projection (viewpoint) of the camera.  Also, the interpretation planes of two parallel lines intersect in a line passing through the viewpoint, and the intersection line is parallel to the parallel lines.

If you choose algebraic calculation, you may use the parametric representation of a 3D line: P = P0 + tV, where P = (X, Y, Z)^T is any point on the line (^T denotes transpose), P0 = (X0, Y0, Z0)^T is a given fixed point on the line, the vector V = (a, b, c)^T represents the direction of the line, and t is the scalar parameter that controls the signed distance between P and P0.
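As an optional numerical sanity check (it does not replace the proof), the parametric form above can be tried out in a few lines of Python. The sketch below assumes an ideal pinhole camera with the viewpoint at the origin, the image plane at Z = f, and a line direction with c != 0; the function names and sample values are illustrative only and do not come from the course material.

```python
# Numerical illustration only (not a proof): project points P = P0 + t*V
# through an ideal pinhole camera and compare with the limiting image point.
# Assumptions: viewpoint at the origin, image plane at Z = f, and c != 0.

def project(point, f=1.0):
    """Perspective projection x = f*X/Z, y = f*Y/Z in the camera frame."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)

def point_on_line(p0, v, t):
    """Point P = P0 + t*V on a 3D line."""
    return tuple(p + t * d for p, d in zip(p0, v))

def vanishing_point(v, f=1.0):
    """Limit of the projection as t grows without bound (requires c != 0)."""
    a, b, c = v
    return (f * a / c, f * b / c)

def cross(u, v):
    """Cross product; the zero vector means u and v are parallel."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

f = 1.0
V = (1.0, 2.0, 4.0)                          # shared direction (a, b, c)
starts = [(0.0, 0.0, 5.0), (3.0, -1.0, 2.0)] # two different P0, same V
vp = vanishing_point(V, f)

# Image points of very distant points on each parallel line.
far_points = [project(point_on_line(p0, V, 1e9), f) for p0 in starts]

# Vector from the viewpoint (origin) to the vanishing point on the image plane.
ray = (vp[0], vp[1], f)
parallel_check = cross(ray, V)

print("vanishing point:", vp)
print("far projections:", far_points)
print("cross(ray, V):", parallel_check)
```

Both lines start at different points P0 but share V, so their far projections should approach the same image point. Your written proof still needs to show why this holds in general.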

2. (Camera Models - 20 points) Show that the relation between any image point (xim, yim)^T (written as (x1, x2, x3)^T in projective space) of a planar surface in 3D space and its corresponding point (Xw, Yw, Zw)^T on the plane can be represented by a 3×3 matrix. You should start from the general form of the camera model (x1, x2, x3)^T = Mint Mext (Xw, Yw, Zw, 1)^T, where the image center (ox, oy), the focal length f, the scaling factors (sx and sy), the rotation matrix R, and the translation vector T are all unknown. Note that in the course slides and lecture notes, I used a simplified model of perspective projection by assuming that ox and oy are known and sx = sy = 1, and discussed only special cases of a plane, so you cannot directly copy those equations. Instead, you should use the general form of the projection matrix and the general form of a plane, nx Xw + ny Yw + nz Zw = d.
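Once you have derived a candidate 3×3 matrix, you may want to check it numerically before writing it up. The Python sketch below is one possible check under stated assumptions: the intrinsic and extrinsic values are made up, the sign convention in `M_int` and the form Mext = [R | t] are assumptions about the model, and the candidate matrix `H` is only valid when d != 0 and nz != 0. It is not the derivation itself, and it is not necessarily the form expected in your answer.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, v):
    """Product of a matrix (nested lists) and a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Made-up intrinsics; the sign convention here is an assumption.
f, sx, sy, ox, oy = 800.0, 1.0, 1.25, 320.0, 240.0
M_int = [[-f / sx, 0.0, ox],
         [0.0, -f / sy, oy],
         [0.0, 0.0, 1.0]]

# Made-up extrinsics: rotation about the Y axis by 30 degrees, plus a
# translation column t, so that M_ext = [R | t] is 3x4.
th = math.radians(30.0)
R = [[math.cos(th), 0.0, math.sin(th)],
     [0.0, 1.0, 0.0],
     [-math.sin(th), 0.0, math.cos(th)]]
t = [0.1, -0.2, 5.0]
M_ext = [R[i] + [t[i]] for i in range(3)]
M = matmul(M_int, M_ext)  # full 3x4 projection matrix

# Plane nx*Xw + ny*Yw + nz*Zw = d (assumes d != 0 and nz != 0).
n = (0.2, -0.1, 1.0)
d = 2.0

# Candidate 3x3 matrix: on the plane, 1 = (n . P) / d, so the fourth column
# of M can be folded into the first three columns.
H = [[M[i][j] + M[i][3] * n[j] / d for j in range(3)] for i in range(3)]

def point_on_plane(Xw, Yw):
    """Pick Zw so that (Xw, Yw, Zw) satisfies the plane equation."""
    Zw = (d - n[0] * Xw - n[1] * Yw) / n[2]
    return [Xw, Yw, Zw]

P = point_on_plane(0.7, -1.3)
full = matvec(M, P + [1.0])   # general 3x4 camera model
reduced = matvec(H, P)        # candidate 3x3 relation

print("3x4 model :", full)
print("3x3 matrix:", reduced)
```

The two printed vectors should agree up to rounding for any point on the plane. Points off the plane will generally disagree, which is a useful second check.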

3. (Calibration - 50 points) Prove the Orthocenter Theorem by geometric arguments: Let T be the triangle on the image plane defined by the three vanishing points of three mutually orthogonal sets of parallel lines in space. Then the image center is the orthocenter of the triangle T (i.e., the common intersection of the three altitudes). Note that you are asked to prove the Orthocenter Theorem, not that the three altitudes of a triangle intersect in a common point (the orthocenter), which you can use as a fact.
(1)    Basic proof: use the result of Question 1, assuming the aspect ratio of the camera is 1. (20 points)
(2)    If you do not know the focal length of the camera, can you still find the image center using the Orthocenter Theorem? Can you further estimate the focal length? For both questions, please show why (and then how) or why not. (20 points)
(3)    If you do not know the aspect ratio of the camera, can you still find the image center using the Orthocenter Theorem? Show why or why not. (10 points)
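To get a feel for the Orthocenter Theorem before proving it, you can test it on synthetic data. The Python sketch below assumes an aspect ratio of 1, an image center (ox, oy) and focal length f chosen arbitrarily, and image coordinates of the form x = ox + f*X/Z, y = oy + f*Y/Z (the sign convention is an assumption). The helper names are illustrative and not part of the course material.

```python
def vanishing_point(direction, f, center):
    """Image of the point at infinity along `direction` (requires c != 0)."""
    a, b, c = direction
    ox, oy = center
    return (ox + f * a / c, oy + f * b / c)

def orthocenter(A, B, C):
    """Intersect two altitudes of triangle ABC by solving a 2x2 system.

    Altitude through A is perpendicular to BC: (H - A) . (C - B) = 0
    Altitude through B is perpendicular to AC: (H - B) . (C - A) = 0
    """
    a11, a12 = C[0] - B[0], C[1] - B[1]
    a21, a22 = C[0] - A[0], C[1] - A[1]
    b1 = A[0] * a11 + A[1] * a12
    b2 = B[0] * a21 + B[1] * a22
    det = a11 * a22 - a12 * a21  # zero only for a degenerate triangle
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)

# Three mutually orthogonal directions (pairwise dot products are zero).
d1 = (1.0, 2.0, 2.0)
d2 = (2.0, 1.0, -2.0)
d3 = (2.0, -2.0, 1.0)

f = 2.0                # made-up focal length
center = (0.5, -0.25)  # made-up image center (ox, oy)

v1 = vanishing_point(d1, f, center)
v2 = vanishing_point(d2, f, center)
v3 = vanishing_point(d3, f, center)
h = orthocenter(v1, v2, v3)

print("vanishing points:", v1, v2, v3)
print("orthocenter:", h, "image center:", center)
```

Changing f or the three directions (while keeping them mutually orthogonal) is a quick way to explore parts (2) and (3). Rescaling only one image axis, for instance, mimics an unknown aspect ratio.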