Does OpenCV Point(x,y) represent (column,row) or (row,column)?
I have a 300x200 image in a matrix src. I am doing the following operation on the image:
for(int i=0;i<src.rows;i++){
    for(int j=0;j<src.cols;j++){
        line( src, Point(i,j),Point(i,j), Scalar( 255, 0, 0 ), 1,8 );
    }
}
imshow("A",src);
waitKey(0);
I was expecting it to cover the entire image in white, but the lower portion of the image remains empty. However, if I do this:
for(int i=0;i<src.rows;i++){
    for(int j=0;j<src.cols;j++){
        src.at<uchar>(i,j)=255;
    }
}
imshow("A",src);
waitKey(0);
The entire image is covered in white. So, this means that src.at<uchar>(i,j) is using (i,j) as (row,column), but Point(x,y) is using (x,y) as (column,row).
Solution 1:
So, this means that src.at<uchar>(i,j) is using (i,j) as (row,column) but Point(x,y) is using (x,y) as (column,row)
That is right! Since this seems to confuse many people, I'll give my interpretation of the reason:
In OpenCV, cv::Mat is used for both images and matrices, since a discrete image is basically the same as a matrix.
In mathematics, we have some different things:
- matrices, which have a number of rows and a number of columns.
- graphs (of functions), which have multiple axes and graphically represent the graph in the form of an image.
- points, which are ordered by the axes of the coordinate system, which is normally a Cartesian coordinate system.
1. For matrices, the mathematical notation is row-first (as in row-major order), which is described as follows:
"Following conventional matrix notation, rows are numbered by the first index of a two-dimensional array and columns by the second index, i.e., a_{1,2} is the second element of the first row, counting downwards and rightwards. (Note this is the opposite of Cartesian conventions.)"
Taken from http://en.wikipedia.org/wiki/Row-major_order#Explanation_and_example
As in mathematics, the first element (row 0, column 0 in OpenCV) is the top-left element of the matrix. Rows and columns work just like in tables:
0/0---column--->
|
|
row
|
|
v
2. For Points, a coordinate system is chosen that fulfills two things:
- It uses the same unit sizes and the same origin as the matrix notation, so the top-left is Point(0,0) and an axis length of 1 means the length of 1 row or 1 column.
- It uses "image notation" for axis ordering, which means that the abscissa (horizontal axis) is the first value, designating the x-direction, and the ordinate (vertical axis) is the second value, designating the y-direction.
The point where the axes meet is the common origin of the two number lines and is simply called the origin. It is often labeled O and if so then the axes are called Ox and Oy. A plane with x- and y-axes defined is often referred to as the Cartesian plane or xy plane. The value of x is called the x-coordinate or abscissa and the value of y is called the y-coordinate or ordinate.
The choices of letters come from the original convention, which is to use the latter part of the alphabet to indicate unknown values. The first part of the alphabet was used to designate known values.
http://en.wikipedia.org/wiki/Cartesian_coordinate_system#Two_dimensions
So in a perfect world, we would choose the coordinate system of points/images to be:
^
|
|
Y
|
|
0/0---X--->
But since we want the origin in the top-left corner and positive values to go toward the bottom, it is instead:
0/0---X--->
|
|
Y
|
|
v
So, for image-processing people, row-first notation might seem weird, but for mathematicians, x-axis-first would be a strange way to access a matrix.
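To make the two orderings concrete, here is a minimal sketch in plain Python rather than OpenCV. The nested list `grid` and the helpers `cell_at_index` and `cell_at_point` are hypothetical names used only for illustration; they mimic how a matrix index (row, column) and a point (x, y) address the same storage.

```python
# A tiny 2-row, 3-column "image" stored row by row, like a matrix.
grid = [
    ["a", "b", "c"],   # row 0
    ["d", "e", "f"],   # row 1
]

def cell_at_index(row, col):
    """Matrix-style access: the first index is the row, the second the column."""
    return grid[row][col]

def cell_at_point(x, y):
    """Point-style access: x is horizontal (column), y is vertical (row)."""
    return grid[y][x]

# Row 0, column 2 and the point x=2, y=0 refer to the same cell.
same_cell = cell_at_index(0, 2) == cell_at_point(2, 0)
```

Both calls reach the cell in the top row, third column; only the order of the two numbers differs.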
So, in OpenCV, you can use mat.at<type>(row,column) or mat.at<type>(cv::Point(x,y)) to access the same point if x=column and y=row, which is perfectly comprehensible =)
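The effect described in the question can be reproduced without OpenCV. The sketch below assumes the image has 300 rows and 200 columns (the question only says 300x200, so this is a guess that matches the empty lower portion) and assumes that drawing outside the image is silently clipped. `draw_point` is a hypothetical stand-in for the one-pixel line call, not an OpenCV function.

```python
ROWS, COLS = 300, 200  # assumed: 300 rows (height), 200 columns (width)

def blank_image():
    """Create a black image as a list of rows."""
    return [[0] * COLS for _ in range(ROWS)]

def draw_point(img, x, y):
    """Hypothetical stand-in for a 1-pixel draw at Point(x, y); off-image points are clipped."""
    if 0 <= x < COLS and 0 <= y < ROWS:
        img[y][x] = 255   # storage is indexed [row][column], i.e. [y][x]

def fill(img, swapped):
    for i in range(ROWS):          # i is a row index
        for j in range(COLS):      # j is a column index
            if swapped:
                draw_point(img, i, j)   # as in the question: Point(i, j)
            else:
                draw_point(img, j, i)   # corrected: Point(column, row)

def painted_rows(img):
    """Count the rows in which every pixel was painted."""
    return sum(1 for row in img if all(v == 255 for v in row))

buggy = blank_image()
fill(buggy, swapped=True)

fixed = blank_image()
fill(fixed, swapped=False)
```

Under these assumptions, only the top 200 rows of `buggy` are painted, while `fixed` is fully covered, which matches the behaviour in the question.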
I hope this is correct. I don't know much about the notations, but that's what my experience in mathematics and imaging tells me.
Solution 2:
I found a quick fix for this problem: convert the OpenCV coordinates to Cartesian coordinates in the 4th quadrant, simply by putting a negative sign in front of the y coordinate.
This way, I was able to use my existing algorithms and all the standard Cartesian equations with OpenCV, without the overhead of an expensive conversion between coordinate systems.
0/0---X--->
|
|
Y
|
|
v
(OpenCV)
0/0---X---->
|
|
|
-Y
|
|
v
(4th quadrant)
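A minimal sketch of this idea, assuming points are plain (x, y) tuples. `to_cartesian`, `to_opencv` and `slope` are hypothetical helper names, not OpenCV functions. It shows that after negating y, a line that runs down and to the right on screen gets the negative slope one would expect in a standard Cartesian plane.

```python
def to_cartesian(point):
    """Map an image point (x right, y down) into the 4th quadrant (y up)."""
    x, y = point
    return (x, -y)

def to_opencv(point):
    """Inverse mapping: negate y again to get back image coordinates."""
    x, y = point
    return (x, -y)

def slope(p, q):
    """Slope of the line through p and q (assumes the x values differ)."""
    return (q[1] - p[1]) / (q[0] - p[0])

# Two image points: the second is 10 px to the right and 20 px further down.
p_img, q_img = (5, 10), (15, 30)

slope_image = slope(p_img, q_img)                                   # 2.0
slope_cartesian = slope(to_cartesian(p_img), to_cartesian(q_img))   # -2.0
```

Because the conversion is a single sign change, results computed in the Cartesian form can be mapped back with `to_opencv` before drawing.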
Solution 3:
Here's a visual example that distinguishes Python's (NumPy) [row, column] indexing from OpenCV's [x, y].
import numpy as np
import matplotlib.pyplot as plt
import cv2
img = np.zeros((5,5)) # initialize empty image as numpy array
img[0,2] = 1 # assign 1 to the pixel of row 0 and column 2
M = cv2.moments(img) # calculate moments of binary image
cX = int(M["m10"] / M["m00"]) # calculate x coordinate of centroid
cY = int(M["m01"] / M["m00"]) # calculate y coordinate of centroid
img2 = np.zeros((5,5)) # initialize another empty image
img2[cX,cY] = 1 # index with [x,y]: NumPy treats cX as the row and cY as the column
img3 = np.zeros((5,5)) # initialize another empty image
img3[cY,cX] = 1 # index with [y,x], i.e. [row,column], which matches img
plt.figure()
plt.subplots_adjust(wspace=0.4) # add space between subplots
plt.subplot(131), plt.imshow(img, cmap = "gray"), plt.title("With [rows,cols]")
plt.subplot(132), plt.imshow(img2, cmap = "gray"), plt.title("With [x,y]")
plt.subplot(133), plt.imshow(img3, cmap= "gray"), plt.title("With [y,x]"), plt.xlabel('x'), plt.ylabel('y')
plt.show() # display the figure when run as a script
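This will output a figure with three panels: the first ("With [rows,cols]") and the third ("With [y,x]") show the bright pixel at row 0, column 2, while the second ("With [x,y]") shows it at row 2, column 0.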