CSC420 Image Understanding

Intro to Image Understanding (CSC420)
Assignment 1
Due Date: Jan 26th, 2024, 11:59:00 pm
Total: 120 marks
General Instructions:
• You are allowed to work directly with one other person to discuss the questions. However, you are still expected to write the solutions/code/report in your own words; i.e.
no copying. If you choose to work with someone else, you must indicate this in your
assignment submission. For example, on the first line of your report file (after your
own name and information, and before starting your answer to Q1), you should have
a sentence that says: “In solving the questions in this assignment, I worked together
with my classmate [name & student number]. I confirm that I have written the solutions/code/report in my own words”.
• Your submission should be in the form of an electronic report (PDF), with the answers
to the specific questions (each question separately), and a presentation and discussion
of your results. For this, please submit a file named report.pdf to MarkUs directly.
• Separately, submit the documented code that you wrote to generate your results.
Please store all of those files in a folder called assignment1, zip the folder and then
submit the file assignment1.zip to MarkUs. You should include a README.txt
file (inside the folder) which details how to run the submitted code.
• Do not worry if you realize you made a mistake after submitting your zip file; you can
submit multiple times on MarkUs.
Part I: Theoretical Problems (60 marks)
[Question 1] Convolution (10 marks)
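[1.a] (5 marks) Calculate and plot the convolution of x[n] and h[n] specified below:
$$x[n] = \begin{cases} 1, & -3 \le n \le 3 \\ 0, & \text{otherwise} \end{cases} \qquad h[n] = \begin{cases} 1, & -2 \le n \le 2 \\ 0, & \text{otherwise} \end{cases} \tag{1}$$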
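[1.b] (5 marks) Calculate and plot the convolution of x[n] and h[n] specified below:
$$x[n] = \begin{cases} 1, & -3 \le n \le 3 \\ 0, & \text{otherwise} \end{cases} \qquad h[n] = \begin{cases} 2 - |n|, & -2 \le n \le 2 \\ 0, & \text{otherwise} \end{cases} \tag{2}$$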
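A hand-computed convolution can be sanity-checked numerically. The sketch below assumes NumPy is available and uses toy signals that are deliberately different from the ones in Question 1. The helper name `convolve_with_support` and its parameters are hypothetical, introduced only for this illustration.

```python
import numpy as np

def convolve_with_support(x, x_start, h, h_start):
    """Convolve two finite sequences and return the output indices and values.

    x_start and h_start are the time indices of x[0] and h[0]
    (hypothetical parameter names chosen for this sketch).
    """
    y = np.convolve(x, h)        # full convolution, length len(x) + len(h) - 1
    y_start = x_start + h_start  # the first nonzero index of y is the sum of the two starts
    n = np.arange(y_start, y_start + len(y))
    return n, y

# Toy signals (not the ones in the question):
# x[n] = 1 for 0 <= n <= 1, h[n] = 1 for -1 <= n <= 1.
n, y = convolve_with_support([1, 1], 0, [1, 1, 1], -1)
print(list(zip(n.tolist(), y.tolist())))
```

The returned index array `n` can be passed directly to a stem plot so that the horizontal axis shows the true sample indices rather than array positions.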
[Question 2] LTI Systems (15 marks)
We define a system as something that takes an input signal, e.g. x(n), and produces an
output signal, e.g. y(n). Linear Time-Invariant (LTI) systems are a class of systems that
are both linear and time-invariant. In linear systems, the output for a linear combination of
inputs is equal to the linear combination of individual responses to those inputs. In other
words, for a system T, signals x1(n) and x2(n), and scalars a1 and a2, system T is linear if
and only if:
T[a1x1(n) + a2x2(n)] = a1T[x1(n)] + a2T[x2(n)]
Also, a system is time-invariant if a shift in its input merely shifts the output; i.e., if T[x(n)] =
y(n), system T is time-invariant if and only if, for every shift n0:
T[x(n − n0)] = y(n − n0)
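The two definitions above can be explored numerically before attempting a proof. The sketch below uses a 3-tap moving average as an assumed stand-in for the system T (it is not one of the systems asked about in this question) and checks both properties on random inputs. Passing such a check on a few inputs is only evidence, not a proof.

```python
import numpy as np

def moving_average(x):
    """Example system T: 3-tap moving average (an assumed stand-in system)."""
    return np.convolve(x, np.ones(3) / 3.0)

def delay(x, n0, total):
    """Place x in a zero buffer of length `total`, delayed by n0 samples."""
    out = np.zeros(total)
    out[n0:n0 + len(x)] = x
    return out

rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=8), rng.normal(size=8)
a1, a2 = 2.0, -0.5

# Linearity: T[a1*x1 + a2*x2] versus a1*T[x1] + a2*T[x2]
is_linear = np.allclose(
    moving_average(a1 * x1 + a2 * x2),
    a1 * moving_average(x1) + a2 * moving_average(x2),
)

# Time invariance: T[x(n - n0)] versus y(n - n0)
n0 = 2
lhs = moving_average(delay(x1, n0, len(x1) + 4))  # shift the input, then apply T
y = moving_average(x1)
rhs = delay(y, n0, len(lhs))                      # apply T, then shift the output
is_time_invariant = np.allclose(lhs, rhs)

print(is_linear, is_time_invariant)
```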
[2.a] (5 marks) Consider a discrete linear time-invariant system T with discrete input signal
x(n) and impulse response h(n). Recall that the impulse response of a discrete system
is defined as the output of the system when the input is an impulse function δ(n), i.e.
T[δ(n)] = h(n), where:
$$\delta(n) = \begin{cases} 1, & \text{if } n = 0, \\ 0, & \text{else.} \end{cases}$$
Prove that T[x(n)] = h(n) ∗ x(n), where ∗ denotes the convolution operation.
Hint: represent signal x(n) as a function of δ(n).
[2.b] (5 marks) Is Gaussian blurring linear? Is it time-invariant? Make sure to include your
justifications.
[2.c] (5 marks) Is time reversal, i.e. T[x(n)] = x(−n), linear? Is it time-invariant? Make
sure to include your justifications.
[Question 3] Polynomial Multiplication and Convolution (15 marks)
Vectors can be used to represent polynomials. For example, the 3rd-degree polynomial
$a_3x^3 + a_2x^2 + a_1x + a_0$ can be represented by the vector $[a_3, a_2, a_1, a_0]$.
If u and v are vectors of polynomial coefficients, prove that convolving them is equivalent to
multiplying the two polynomials they each represent.
Hint: You need to assume proper zero-padding to support the full-size convolution.
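As a quick numerical illustration of the claim (not a substitute for the proof), the sketch below assumes NumPy and compares `np.convolve` with `np.polymul` on one small pair of polynomials chosen for this example.

```python
import numpy as np

u = [1, 2]  # x + 2
v = [1, 3]  # x + 3

# Full convolution of the coefficient vectors.
product_by_convolution = np.convolve(u, v)

# Direct polynomial multiplication of the same coefficient vectors.
product_by_polymul = np.polymul(u, v)

# (x + 2)(x + 3) = x^2 + 5x + 6
print(product_by_convolution, product_by_polymul)
```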
[Question 4] Laplacian Operator (20 marks)
The Laplace operator is a second-order differential operator in n-dimensional Euclidean
space, defined as the divergence (∇·) of the gradient (∇f). Thus if f is a twice-differentiable
real-valued function, then the Laplacian of f is defined by:
$$\Delta f = \nabla^2 f = \nabla \cdot \nabla f$$
where the latter notations derive from formally writing:
$$\nabla = \left( \frac{\partial}{\partial x_1}, \ldots, \frac{\partial}{\partial x_n} \right)$$
Now, consider a 2D image I(x, y) and its Laplacian, given by $\Delta I = I_{xx} + I_{yy}$. Here the second
partial derivatives are taken with respect to the directions of the variables x, y associated
with the image grid for convenience. Show that the Laplacian is in fact rotation invariant.
In other words, show that $\Delta I = I_{rr} + I_{r'r'}$, where r and r′ are any two orthogonal directions.
Hint: Start by using polar coordinates to describe a chosen location (x, y). Then use the
chain rule.
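Before working through the derivation, the claim can be checked numerically. The sketch below assumes NumPy, picks an arbitrary smooth test function `f` (hypothetical, standing in for the image I), and sums central-difference second derivatives along two orthogonal directions for several rotation angles. It is only a sanity check under these assumptions, not a proof.

```python
import numpy as np

def f(x, y):
    """Arbitrary smooth test function standing in for the image I(x, y)."""
    return x**3 * y + np.sin(x) * y**2

def second_directional_derivative(p, u, h=1e-3):
    """Central-difference second derivative of f at point p along unit vector u."""
    p = np.asarray(p, dtype=float)
    u = np.asarray(u, dtype=float)
    return (f(*(p + h * u)) - 2.0 * f(*p) + f(*(p - h * u))) / h**2

def laplacian_in_rotated_frame(p, theta):
    """Sum of second derivatives along r and r', the axes rotated by theta."""
    r = (np.cos(theta), np.sin(theta))
    r_perp = (-np.sin(theta), np.cos(theta))
    return second_directional_derivative(p, r) + second_directional_derivative(p, r_perp)

p = (0.7, -0.4)
values = [laplacian_in_rotated_frame(p, t) for t in (0.0, 0.3, 1.1, 2.5)]
print(values)  # the entries should agree up to finite-difference error
```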
Part II: Implementation Tasks (60 marks)
[Question 5] Canny Edge Detector Robustness (10 marks)
Using the sample code provided in Tutorial 2, examine the sensitivity of the Canny edge
detector to Gaussian noise. To do so, take an image of your choice, and add i.i.d. Gaussian
noise to each pixel. Analyze the performance of the edge detector as a function of noise variance. Include your observations and three sample outputs (corresponding to low, medium,
and high noise variances) in the report.
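One possible way to generate the noisy inputs is sketched below. It assumes an 8-bit grayscale image stored as a NumPy array, and the function name `add_gaussian_noise`, the synthetic test image, and the three noise levels are illustrative choices rather than values prescribed by the assignment.

```python
import numpy as np

def add_gaussian_noise(image, sigma, seed=None):
    """Add i.i.d. zero-mean Gaussian noise (standard deviation sigma) to an 8-bit image."""
    rng = np.random.default_rng(seed)
    noisy = image.astype(np.float64) + rng.normal(0.0, sigma, size=image.shape)
    # Clip back to the valid 8-bit range before converting.
    return np.clip(noisy, 0, 255).astype(np.uint8)

# Synthetic stand-in image: a bright square on a dark background.
image = np.zeros((64, 64), dtype=np.uint8)
image[16:48, 16:48] = 200

# Assumed low, medium, and high noise levels (variances 25, 400, and 2500).
noisy_versions = {
    sigma: add_gaussian_noise(image, sigma, seed=0) for sigma in (5, 20, 50)
}
```

Each noisy image can then be passed to the Canny implementation from Tutorial 2. Note that clipping to [0, 255] slightly changes the noise statistics near black and white pixels.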
[Question 6] Edge Detection (50 marks)
In this question, the goal is to implement a rudimentary edge detection process that uses a
derivative of Gaussian, through a series of steps. For each step (excluding step 1) you are
supposed to test your implementation on the provided image, and also on one image of your
own choice. Include the results in your report.
Step I - Gaussian Blurring (10 marks): Implement a function that returns a 2D Gaussian matrix for input size and scale σ. Please note that you should not use any of the
existing libraries to create the filter, e.g. cv2.getGaussianKernel(). Moreover, visualize this
2D Gaussian matrix for two choices of σ with appropriate filter sizes. For the visualization,
you may consider a 2D image with a colormap, or a 3D graph. Make sure to include the
color bar or axis values.
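A minimal sketch of one way to build such a matrix is shown below. It assumes NumPy, a square kernel centred on the middle of the grid, and normalization so that the entries sum to 1. The normalization and the function name `gaussian_kernel` are assumptions of this sketch, not requirements stated in the assignment.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian matrix with scale sigma, normalized to sum to 1."""
    half = (size - 1) / 2.0
    coords = np.arange(size) - half          # coordinates centred on the kernel midpoint
    xx, yy = np.meshgrid(coords, coords)
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return kernel / kernel.sum()             # normalization is an assumption of this sketch

kernel = gaussian_kernel(5, 1.0)
```

A common rule of thumb is to choose the size so that the kernel covers roughly three standard deviations on each side of the centre, which keeps the truncated tails small.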
Step II - Gradient Magnitude (10 marks): In the lectures, we discussed how partial
derivatives of an image are computed. We know that the edges in an image are from the
sudden changes of intensity and one way to capture that sudden change is to calculate the
gradient magnitude at each pixel. The edge strength or gradient magnitude is defined as:
$$g(x, y) = \sqrt{g_x^2 + g_y^2}$$
where gx and gy are the gradients of image f(x, y) along the x-axis and y-axis directions, respectively.
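Using the Sobel operator, gx and gy can be computed as:
$$g_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} * f(x, y), \qquad g_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix} * f(x, y)$$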
Implement a function that receives an image f(x, y) as input and returns its gradient magnitude
g(x, y) as output using the Sobel operator. You are expected to implement the convolution required for this task from scratch, without using any existing libraries.
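The convolution itself can be written with explicit loops. The sketch below is one possible starting point under stated assumptions: an odd-sized kernel, zero padding at the borders, and an output of the same size as the input. NumPy is used only for array storage and padding, and the name `convolve2d` is hypothetical.

```python
import numpy as np

def convolve2d(image, kernel):
    """Naive same-size 2D convolution with zero padding (odd-sized kernels assumed)."""
    image = np.asarray(image, dtype=np.float64)
    kernel = np.asarray(kernel, dtype=np.float64)
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
    flipped = kernel[::-1, ::-1]  # flipping is what distinguishes convolution from correlation
    out = np.zeros_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

# Convolving a unit impulse with a kernel should reproduce the kernel.
impulse = np.zeros((5, 5))
impulse[2, 2] = 1.0
test_kernel = np.arange(9, dtype=np.float64).reshape(3, 3)
response = convolve2d(impulse, test_kernel)
```

Zero padding is only one border policy. It produces artificially strong gradients along the image boundary, so replicating edge pixels may be preferable in practice.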
Step III - Threshold Algorithm (20 marks): After finding the image gradient, the
next step is to automatically find a threshold value so that edges can be determined. One
algorithm to automatically determine image-dependent threshold is as follows:
1. Let the initial threshold τ0 be equal to the average intensity of gradient image g(x, y),
as defined below:
$$\tau_0 = \frac{1}{h \times w} \sum_{x} \sum_{y} g(x, y)$$
where the sums run over all pixels, and h and w are the height and width of the image under consideration.
2. Set iteration index i = 0 (on the first pass only), and categorize the pixels into two classes, where the lower
class consists of the pixels whose gradient magnitudes are less than the current threshold τi, and the upper
class contains the rest of the pixels.
3. Compute the average gradient magnitudes mL and mH of lower and upper classes,
respectively.
4. Set iteration i = i + 1 and update threshold value as:
$$\tau_i = \frac{m_L + m_H}{2}$$
5. Repeat steps 2 to 4 until $|\tau_i - \tau_{i-1}| \le \epsilon$ is satisfied, where $\epsilon \to 0$; take τi as the final
threshold and denote it by τ.
Once the final threshold is obtained, each pixel of the gradient image g(x, y) is compared
with τ. Pixels whose gradient magnitude is at least τ are considered edge points and
are represented as white pixels; all other pixels are designated as black. The edge-mapped
image E(x, y) thus obtained is:
$$E(x, y) = \begin{cases} 255, & \text{if } g(x, y) \ge \tau \\ 0, & \text{otherwise} \end{cases}$$
Implement the aforementioned threshold algorithm. The input to this algorithm is the gradient image g(x, y) obtained from step II, and the output is a black and white edge-mapped
image E(x, y).
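The iteration described in steps 1 to 5 can be sketched roughly as follows. This assumes NumPy and a finite tolerance `eps` in place of the limit ε → 0. The iteration cap `max_iter` and the handling of an empty class are safeguards added for this sketch and are not part of the algorithm as stated.

```python
import numpy as np

def iterative_threshold(g, eps=1e-3, max_iter=1000):
    """Iteratively estimate a threshold from gradient-magnitude image g."""
    g = np.asarray(g, dtype=np.float64)
    tau = g.mean()                        # step 1: tau_0 is the mean gradient magnitude
    for _ in range(max_iter):             # max_iter is a safeguard (assumption)
        lower = g[g < tau]                # step 2: split pixels into two classes
        upper = g[g >= tau]
        if lower.size == 0 or upper.size == 0:
            break                         # degenerate case, e.g. a constant image
        new_tau = 0.5 * (lower.mean() + upper.mean())  # steps 3 and 4
        converged = abs(new_tau - tau) <= eps          # step 5
        tau = new_tau
        if converged:
            break
    return tau

def edge_map(g, tau):
    """Return E(x, y): 255 where g >= tau, 0 elsewhere."""
    return np.where(np.asarray(g) >= tau, 255, 0).astype(np.uint8)

# Tiny toy gradient image, chosen so the result is easy to verify by hand.
g = np.array([[0.0, 0.0, 0.0], [0.0, 10.0, 10.0]])
tau = iterative_threshold(g)
E = edge_map(g, tau)
```

For the toy input, the initial threshold is the mean 10/3, the class means are 0 and 10, and the threshold settles at 5 after the second pass.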
Step IV - Test (10 marks): Use the image provided along with this assignment, and also
one image of your choice to test all the previous steps (I to III) and to visualize your results
in the report. Convert the images to grayscale first. Please note that the input to each step
is the output of the previous step. In a brief paragraph, discuss how the algorithm works for
these two examples and highlight its strengths and/or its weaknesses.