
Binocular vision target tracking and 3D coordinate acquisition—python (code)

posted on 2023-06-06 11:19


September 2022 update:

On top of the original version, I used yolov5 instead of OpenCV's target detection algorithm to assist the camera in acquiring three-dimensional coordinates, and successfully used those coordinates to control a robotic arm in real time. If you are interested, you can watch the video on my bilibili channel; there is also an open-source link below the video: [Soft Core] I developed a robotic arm for binocular vision target detection_哔哩哔哩_bilibili


The following is the original post:

After studying binocular vision for several days on CSDN and bilibili, I can roughly implement the basic functions. I'll record my approach here, along with the pitfalls I ran into.

Let's look at the final result first: the program tracks an object and displays its coordinates, that is, the pixel position plus the computed depth:

Now let me go through the specific steps.

1. Camera Calibration

The prerequisite for using a binocular camera is obtaining its intrinsic and extrinsic parameters. Some more expensive cameras ship with these parameters from the factory; ordinary cameras need to be calibrated by ourselves. I calibrated mine with MATLAB. For the specific steps you can read this blog: Matlab binocular camera calibration_indigo love's blog-CSDN blog_matlab binocular camera calibration

It is very detailed, and the code it gives can be run directly, but pay attention to one detail: in the MATLAB calibration interface, the radial distortion option defaults to 2 coefficients; we need to manually select 3 coefficients, otherwise the exported camera parameters will be different.
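If you would rather stay in Python, OpenCV can do the same stereo calibration. Below is a minimal sketch, assuming a chessboard with 9x6 inner corners (the board size and square size are my assumptions, not from the original post) and image pairs saved by the capture script in part 2:

import glob
import cv2
import numpy as np

PATTERN = (9, 6)   # assumed inner-corner count of the chessboard
SQUARE_MM = 25.0   # assumed square size in millimetres

# 3D coordinates of the chessboard corners in the board's own frame
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_MM

obj_points, left_points, right_points = [], [], []
for lf, rf in zip(sorted(glob.glob("SaveImage/left_*.jpg")),
                  sorted(glob.glob("SaveImage/right_*.jpg"))):
    gl = cv2.imread(lf, cv2.IMREAD_GRAYSCALE)
    gr = cv2.imread(rf, cv2.IMREAD_GRAYSCALE)
    okl, cl = cv2.findChessboardCorners(gl, PATTERN)
    okr, cr = cv2.findChessboardCorners(gr, PATTERN)
    if okl and okr:  # keep only pairs where both views found the board
        obj_points.append(objp)
        left_points.append(cl)
        right_points.append(cr)

size = gl.shape[::-1]  # (width, height)
# Calibrate each camera on its own, then the pair; R and T map left to right
_, K1, D1, _, _ = cv2.calibrateCamera(obj_points, left_points, size, None, None)
_, K2, D2, _, _ = cv2.calibrateCamera(obj_points, right_points, size, None, None)
ret, K1, D1, K2, D2, R, T, E, F = cv2.stereoCalibrate(
    obj_points, left_points, right_points, K1, D1, K2, D2, size,
    flags=cv2.CALIB_FIX_INTRINSIC)
print("RMS reprojection error:", ret)
print("R:\n", R, "\nT:\n", T)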

First, put the calibration results into a file called stereoconfig.py for later use:

import numpy as np


class stereoCamera(object):
    def __init__(self):
        # Left camera intrinsic matrix
        self.cam_matrix_left = np.array([[684.8165, 0, 637.2704],
                                         [0, 685.4432, 320.5347],
                                         [0, 0, 1]])
        # Right camera intrinsic matrix
        self.cam_matrix_right = np.array([[778.2081, 0, 602.9231],
                                          [0, 781.9883, 319.6632],
                                          [0, 0, 1]])
        # Distortion coefficients of the left and right cameras: [k1, k2, p1, p2, k3]
        self.distortion_l = np.array([[0.1342, -0.3101, 0, 0, 0.1673]])
        self.distortion_r = np.array([[0.4604, -2.3963, 0, 0, 5.2266]])
        # Rotation matrix (right camera relative to the left)
        self.R = np.array([[0.9993, -0.0038, -0.0364],
                           [0.0033, 0.9999, -0.0143],
                           [0.0365, 0.0142, 0.9992]])
        # Translation vector
        self.T = np.array([[-44.8076], [5.7648], [51.7586]])
        # Difference of the principal-point column coordinates
        self.doffs = 0.0
        # Whether the parameters above are the result of stereo rectification
        self.isRectified = False

    def setMiddleBurryParams(self):
        self.cam_matrix_left = np.array([[684.8165, 0, 637.2704],
                                         [0, 685.4432, 320.5347],
                                         [0, 0, 1]])
        self.cam_matrix_right = np.array([[778.2081, 0, 602.9231],
                                          [0, 781.9883, 319.6632],
                                          [0, 0, 1]])
        self.distortion_l = np.array([[0.1342, -0.3101, 0, 0, 0.1673]])
        self.distortion_r = np.array([[0.4604, -2.3963, 0, 0, 5.2266]])
        self.R = np.array([[0.9993, -0.0038, -0.0364],
                           [0.0033, 0.9999, -0.0143],
                           [0.0365, 0.0142, 0.9992]])
        self.T = np.array([[-44.8076], [5.7648], [51.7586]])
        self.doffs = 131.111
        self.isRectified = True
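
A quick sanity check that the file is importable (just a usage sketch; the full script in part 5 does exactly this):

import stereoconfig

config = stereoconfig.stereoCamera()
print(config.cam_matrix_left)  # left intrinsic matrix
print(config.T)                # translation between the two cameras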

2. How to open both cameras with Python

This really shouldn't be a major point, but it stalled me for quite a while, so let me explain it.

First of all, although a binocular camera has two lenses, they share the same device ID. In other words, camera = cv2.VideoCapture(0) with ID 0 already opens both of them. But if you only run that line, you will only see the left camera. Why? It is not that the other camera isn't open; it is that the default frame width isn't large enough, so only one view fits. For a 2560×720 camera, use the code below to split the frame and show the two halves. Cameras with other resolutions can refer to this blog, which I largely followed anyway: OpenCV to open the binocular camera (python version)_一小树x的博客-CSDN Blog_opencv to open the binocular camera

I suggest opening two windows to display the left and right cameras separately. Of course, you can also show both in one window; just crop the frame accordingly. I won't go into details.

# -*- coding: utf-8 -*-
import cv2
import time

AUTO = False   # take photos automatically, or press the "s" key to shoot manually
INTERVAL = 2   # interval between automatic shots, in seconds

cv2.namedWindow("left")
cv2.namedWindow("right")
camera = cv2.VideoCapture(0)

# Both cameras share one device ID; the combined resolution is 2560x720,
# which we split into two 1280x720 halves
camera.set(cv2.CAP_PROP_FRAME_WIDTH, 2560)
camera.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)

counter = 0
utc = time.time()
folder = "./SaveImage/"  # directory for the snapshots


def shot(pos, frame):
    global counter
    path = folder + pos + "_" + str(counter) + ".jpg"
    cv2.imwrite(path, frame)
    print("snapshot saved into: " + path)


while True:
    ret, frame = camera.read()
    print("ret:", ret)
    # Crop as [y0:y1, x0:x1], i.e. HEIGHT x WIDTH
    left_frame = frame[0:720, 0:1280]
    right_frame = frame[0:720, 1280:2560]
    cv2.imshow("left", left_frame)
    cv2.imshow("right", right_frame)

    now = time.time()
    if AUTO and now - utc >= INTERVAL:
        shot("left", left_frame)
        shot("right", right_frame)
        counter += 1
        utc = now

    key = cv2.waitKey(1)
    if key == ord("q"):
        break
    elif key == ord("s"):
        shot("left", left_frame)
        shot("right", right_frame)
        counter += 1

camera.release()
cv2.destroyWindow("left")
cv2.destroyWindow("right")
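
One pitfall I'd flag here (my own addition, not from the original post): VideoCapture.set() may silently fail to apply a mode the driver doesn't support, so it is worth verifying the resolution you actually received:

# Confirm the requested 2560x720 mode was applied;
# if not, the [0:720, 0:1280] splitting indices above won't match the frames
print("width :", camera.get(cv2.CAP_PROP_FRAME_WIDTH))
print("height:", camera.get(cv2.CAP_PROP_FRAME_HEIGHT))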

3. Implementing target tracking

The idea of this article is to first implement target tracking with the monocular (left) camera alone. Once tracking works, we have the target's two-dimensional pixel coordinates in the left image; offsetting them by the disparity gives the target's pixel coordinates in the right image. With both sets of coordinates, the third dimension can be recovered by least squares; a small triangulation sketch follows.
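To make the least-squares step concrete, here is a minimal DLT triangulation sketch. The function and variable names are mine, not from the original code:

import numpy as np

def triangulate_lstsq(P_l, P_r, pt_l, pt_r):
    """Least-squares (DLT) triangulation of one point seen in two views.

    P_l, P_r : 3x4 projection matrices (K [R|t]) of the two cameras.
    pt_l, pt_r : (u, v) pixel coordinates of the same point.
    """
    # Each view contributes two linear equations in the homogeneous point X:
    #   u * P[2] - P[0] = 0   and   v * P[2] - P[1] = 0
    A = np.vstack([
        pt_l[0] * P_l[2] - P_l[0],
        pt_l[1] * P_l[2] - P_l[1],
        pt_r[0] * P_r[2] - P_r[0],
        pt_r[1] * P_r[2] - P_r[1],
    ])
    _, _, vt = np.linalg.svd(A)  # least-squares solution of A @ X = 0
    X = vt[-1]
    return X[:3] / X[3]          # de-homogenize to (x, y, z)

# With the matrices defined in the full script in part 5, the projection
# matrices would be, for example:
# P_l = leftIntrinsic @ np.hstack([leftRotation, leftTranslation])
# P_r = rightIntrinsic @ np.hstack([rightRotation, rightTranslation])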

In short, here is the target-tracking code for the monocular (left) camera first:

import cv2

vs = cv2.VideoCapture(0)  # 0 selects the first camera
cv2.namedWindow("Frame")

# Check whether the camera opened
if vs.isOpened():
    print('camera Opened')
else:
    print('camera not opened')

# These constructors come from opencv-contrib-python
# (on versions >= 4.5.1 they live under cv2.legacy, see the note below)
OPENCV_OBJECT_TRACKERS = {
    "csrt": cv2.TrackerCSRT_create, "kcf": cv2.TrackerKCF_create,
    "boosting": cv2.TrackerBoosting_create, "mil": cv2.TrackerMIL_create,
    "tld": cv2.TrackerTLD_create,
    "medianflow": cv2.TrackerMedianFlow_create, "mosse": cv2.TrackerMOSSE_create
}
trackers = cv2.MultiTracker_create()

while True:
    ret, frame = vs.read()
    if frame is None:
        break
    # Resize the frame to a fixed display width
    (h, w) = frame.shape[:2]
    width = 800
    r = width / float(w)
    dim = (width, int(h * r))
    frame = cv2.resize(frame, dim, interpolation=cv2.INTER_AREA)
    # Update every tracker with the new frame
    (success, boxes) = trackers.update(frame)
    # Drawing loop
    for box in boxes:
        (x, y, w, h) = [int(v) for v in box]
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow('Frame', frame)
    # Key handling: "s" selects a new target, Esc quits
    key = cv2.waitKey(10) & 0xFF
    if key == ord('s'):
        box = cv2.selectROI('Frame', frame, fromCenter=False, showCrosshair=True)
        tracker = cv2.TrackerCSRT_create()
        print(type(box), type(box[0]), box[1], box)
        trackers.add(tracker, frame, box)
    elif key == 27:
        break

vs.release()
cv2.destroyAllWindows()

After running it, a Frame window pops up. Press "s" and the image freezes; draw a box around the target with the mouse, then press Space or Enter to confirm, and the tracking starts.
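One version note from me (not in the original post): in opencv-contrib-python 4.5.1 and later, these tracker constructors moved into the cv2.legacy module, so a guarded lookup keeps the script working on both old and new builds:

import cv2

# Use cv2.legacy when it exists (OpenCV >= 4.5.1), else the old top-level names
legacy = getattr(cv2, "legacy", cv2)
OPENCV_OBJECT_TRACKERS = {
    "csrt": legacy.TrackerCSRT_create,
    "kcf": legacy.TrackerKCF_create,
    "boosting": legacy.TrackerBoosting_create,
    "mil": legacy.TrackerMIL_create,
    "tld": legacy.TrackerTLD_create,
    "medianflow": legacy.TrackerMedianFlow_create,
    "mosse": legacy.TrackerMOSSE_create,
}
trackers = legacy.MultiTracker_create()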

4. Acquiring the z-axis coordinate

This is the core part; the idea is what I described at the beginning of part 3.

(1) Computing the disparity

The function below returns disp, the disparity map. Disparity is defined as disparity = ul − ur, that is, the pixel column in the left image minus the pixel column in the right image. The larger the disparity, the closer the point is to the camera. This is easy to feel for yourself: treat the binocular camera as your eyes and alternately close one eye while opening the other. Objects appear to jump, and closer objects jump a greater distance, right?

import cv2
import numpy as np

# Disparity computation
def stereoMatchSGBM(left_image, right_image, down_scale=False):
    # SGBM matching parameters
    if left_image.ndim == 2:
        img_channels = 1
    else:
        img_channels = 3
    blockSize = 3
    paraml = {'minDisparity': 0,
              'numDisparities': 64,
              'blockSize': blockSize,
              'P1': 8 * img_channels * blockSize ** 2,
              'P2': 32 * img_channels * blockSize ** 2,
              'disp12MaxDiff': 1,
              'preFilterCap': 63,
              'uniquenessRatio': 15,
              'speckleWindowSize': 100,
              'speckleRange': 1,
              'mode': cv2.STEREO_SGBM_MODE_SGBM_3WAY
              }
    # Build the SGBM matchers (copy the dict so the right matcher's
    # settings don't overwrite the left one's)
    left_matcher = cv2.StereoSGBM_create(**paraml)
    paramr = paraml.copy()
    paramr['minDisparity'] = -paraml['numDisparities']
    right_matcher = cv2.StereoSGBM_create(**paramr)
    # Compute the disparity maps
    size = (left_image.shape[1], left_image.shape[0])
    if not down_scale:
        disparity_left = left_matcher.compute(left_image, right_image)
        disparity_right = right_matcher.compute(right_image, left_image)
    else:
        left_image_down = cv2.pyrDown(left_image)
        right_image_down = cv2.pyrDown(right_image)
        factor = left_image.shape[1] / left_image_down.shape[1]
        disparity_left_half = left_matcher.compute(left_image_down, right_image_down)
        disparity_right_half = right_matcher.compute(right_image_down, left_image_down)
        disparity_left = cv2.resize(disparity_left_half, size, interpolation=cv2.INTER_AREA)
        disparity_right = cv2.resize(disparity_right_half, size, interpolation=cv2.INTER_AREA)
        disparity_left = factor * disparity_left
        disparity_right = factor * disparity_right
    # True disparity (SGBM returns disparity multiplied by 16)
    trueDisp_left = disparity_left.astype(np.float32) / 16.
    trueDisp_right = disparity_right.astype(np.float32) / 16.
    return trueDisp_left, trueDisp_right
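
A usage sketch for this function on a stereo pair saved by the capture script in part 2:

import cv2

left = cv2.imread("SaveImage/left_0.jpg")
right = cv2.imread("SaveImage/right_0.jpg")
disp, _ = stereoMatchSGBM(left, right)
# Stretch the disparities into [0, 255] so the map is visible as an image
vis = cv2.normalize(disp, None, 0, 255, cv2.NORM_MINMAX).astype("uint8")
cv2.imwrite("disparity_vis.jpg", vis)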

(2) Computing the target's pixel coordinates in both views

After obtaining the disparity, the target's pixel coordinates in the right image can be computed from its left-image coordinates. Here disp is the disparity map we just computed. Note that the indexing is disp[yy, xx], not disp[xx, yy]: NumPy indexes the row (y) first, then the column (x), as you can confirm from disp's shape.

# Drawing loop; (x, y) and (x+w, y+h) are the top-left and bottom-right
# corners of the box you drew
for box in boxes:
    (x, y, w, h) = [int(v) for v in box]
    cv2.rectangle(left_frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    # Convert to the coordinates of the box's center point
    xx = round((2 * x + w) / 2)
    yy = round((2 * y + h) / 2)
    # xr and yr are the pixel coordinates of the corresponding point in the
    # right image; since disparity = ul - ur, subtract the disparity
    xr = xx - disp[yy, xx]
    yr = yy
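
If you want the full (x, y, z) of the point rather than only its depth, the reprojection matrix Q returned by the rectification step does the whole conversion in one call (a small sketch; disp, Q, yy and xx come from the code shown in this post):

points_3d = cv2.reprojectImageTo3D(disp, Q)  # h x w x 3 array of (x, y, z)
x3, y3, z3 = points_3d[yy, xx]               # same units as T from calibration (mm here)
print("target at:", x3, y3, z3)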

(3) Z-axis coordinate calculation

I only know that the smaller the disparity, the greater the depth; I don't fully understand the derivation of converting disparity into real coordinates. (By similar triangles, depth = f·B / disparity, where f is the focal length in pixels and B is the baseline between the two cameras.) The code below is someone else's.

def getDepthMapWithConfig(config: stereoconfig.stereoCamera) -> np.ndarray:
    # f * B: focal length (pixels) times baseline; T[0] is negative,
    # so -T[0] is the positive baseline length
    fb = config.cam_matrix_left[0, 0] * (-config.T[0])
    doffs = config.doffs  # difference of the principal-point columns
    disparity = dot_disp  # disparity at the target pixel, set in the main loop
    depth = fb / (disparity + doffs)
    return depth
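
As a quick sanity check with my calibration numbers (the 30 px disparity is just an assumed value):

# f = 684.8165 px, B = 44.8076 mm, doffs = 0 (the __init__ defaults above)
fb = 684.8165 * 44.8076
print(fb / 30.0)  # ~1022.8, i.e. a 30 px disparity is roughly one metre away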

5. Final result

Here is the complete code:

import cv2
import numpy as np
import stereoconfig

# Left camera intrinsic matrix
leftIntrinsic = np.array([[684.8165, 0, 637.2704],
                          [0, 685.4432, 320.5347],
                          [0, 0, 1]])
# Right camera intrinsic matrix
rightIntrinsic = np.array([[778.2081, 0, 602.9231],
                           [0, 781.9883, 319.6632],
                           [0, 0, 1]])
# Rotation matrices
leftRotation = np.array([[1, 0, 0],
                         [0, 1, 0],
                         [0, 0, 1]])
rightRotation = np.array([[0.9993, -0.0038, -0.0364],
                          [0.0033, 0.9999, -0.0143],
                          [0.0365, 0.0142, 0.9992]])
# Translation vectors
rightTranslation = np.array([[-44.8076], [5.7648], [51.7586]])
leftTranslation = np.array([[0], [0], [0]])


# Preprocessing
def preprocess(img1, img2):
    # Color image -> grayscale
    if img1.ndim == 3:
        img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)  # OpenCV loads images in BGR channel order
    if img2.ndim == 3:
        img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
    # Histogram equalization
    img1 = cv2.equalizeHist(img1)
    img2 = cv2.equalizeHist(img2)
    return img1, img2


# Undistortion
def undistortion(image, camera_matrix, dist_coeff):
    undistortion_image = cv2.undistort(image, camera_matrix, dist_coeff)
    return undistortion_image


# Get the mapping matrices for undistortion and stereo rectification,
# plus the reprojection matrix Q.
# @param config: a class holding the stereo calibration parameters:
#                config = stereoconfig.stereoCamera()
def getRectifyTransform(height, width, config):
    # Read the intrinsic and extrinsic parameters
    left_K = config.cam_matrix_left
    right_K = config.cam_matrix_right
    left_distortion = config.distortion_l
    right_distortion = config.distortion_r
    R = config.R
    T = config.T
    # Compute the rectification transforms
    R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(left_K, left_distortion, right_K, right_distortion,
                                                      (width, height), R, T, alpha=0)
    map1x, map1y = cv2.initUndistortRectifyMap(left_K, left_distortion, R1, P1, (width, height), cv2.CV_32FC1)
    map2x, map2y = cv2.initUndistortRectifyMap(right_K, right_distortion, R2, P2, (width, height), cv2.CV_32FC1)
    return map1x, map1y, map2x, map2y, Q


# Undistort and stereo-rectify an image pair
def rectifyImage(image1, image2, map1x, map1y, map2x, map2y):
    rectifyed_img1 = cv2.remap(image1, map1x, map1y, cv2.INTER_AREA)
    rectifyed_img2 = cv2.remap(image2, map2x, map2y, cv2.INTER_AREA)
    return rectifyed_img1, rectifyed_img2


# Disparity computation
def stereoMatchSGBM(left_image, right_image, down_scale=False):
    # SGBM matching parameters
    if left_image.ndim == 2:
        img_channels = 1
    else:
        img_channels = 3
    blockSize = 3
    paraml = {'minDisparity': 0,
              'numDisparities': 64,
              'blockSize': blockSize,
              'P1': 8 * img_channels * blockSize ** 2,
              'P2': 32 * img_channels * blockSize ** 2,
              'disp12MaxDiff': 1,
              'preFilterCap': 63,
              'uniquenessRatio': 15,
              'speckleWindowSize': 100,
              'speckleRange': 1,
              'mode': cv2.STEREO_SGBM_MODE_SGBM_3WAY
              }
    # Build the SGBM matchers (copy the dict so the right matcher's
    # settings don't overwrite the left one's)
    left_matcher = cv2.StereoSGBM_create(**paraml)
    paramr = paraml.copy()
    paramr['minDisparity'] = -paraml['numDisparities']
    right_matcher = cv2.StereoSGBM_create(**paramr)
    # Compute the disparity maps
    size = (left_image.shape[1], left_image.shape[0])
    if not down_scale:
        disparity_left = left_matcher.compute(left_image, right_image)
        disparity_right = right_matcher.compute(right_image, left_image)
    else:
        left_image_down = cv2.pyrDown(left_image)
        right_image_down = cv2.pyrDown(right_image)
        factor = left_image.shape[1] / left_image_down.shape[1]
        disparity_left_half = left_matcher.compute(left_image_down, right_image_down)
        disparity_right_half = right_matcher.compute(right_image_down, left_image_down)
        disparity_left = cv2.resize(disparity_left_half, size, interpolation=cv2.INTER_AREA)
        disparity_right = cv2.resize(disparity_right_half, size, interpolation=cv2.INTER_AREA)
        disparity_left = factor * disparity_left
        disparity_right = factor * disparity_right
    # True disparity (SGBM returns disparity multiplied by 16)
    trueDisp_left = disparity_left.astype(np.float32) / 16.
    trueDisp_right = disparity_right.astype(np.float32) / 16.
    return trueDisp_left, trueDisp_right


# Convert an h x w x 3 array into an N x 3 array
def hw3ToN3(points):
    height, width = points.shape[0:2]
    points_1 = points[:, :, 0].reshape(height * width, 1)
    points_2 = points[:, :, 1].reshape(height * width, 1)
    points_3 = points[:, :, 2].reshape(height * width, 1)
    points_ = np.hstack((points_1, points_2, points_3))
    return points_


def getDepthMapWithQ(disparityMap: np.ndarray, Q: np.ndarray) -> np.ndarray:
    points_3d = cv2.reprojectImageTo3D(disparityMap, Q)
    depthMap = points_3d[:, :, 2]
    reset_index = np.where(np.logical_or(depthMap < 0.0, depthMap > 65535.0))
    depthMap[reset_index] = 0
    return depthMap.astype(np.float32)


def getDepthMapWithConfig(config: stereoconfig.stereoCamera) -> np.ndarray:
    # depth = f * B / (disparity + doffs); T[0] is negative, so -T[0] is the baseline
    fb = config.cam_matrix_left[0, 0] * (-config.T[0])
    doffs = config.doffs
    disparity = dot_disp  # disparity at the target pixel, set in the main loop
    depth = fb / (disparity + doffs)
    return depth


vs = cv2.VideoCapture(0)  # 0 selects the first camera
cv2.namedWindow("Frame")
# Set the combined camera resolution
vs.set(cv2.CAP_PROP_FRAME_WIDTH, 2560)
vs.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
# Check whether the camera opened
if vs.isOpened():
    print('camera Opened')
else:
    print('camera not opened')

# Tracker constructors from opencv-contrib-python
# (under cv2.legacy in versions >= 4.5.1, see the note at the end of part 3)
OPENCV_OBJECT_TRACKERS = {
    "csrt": cv2.TrackerCSRT_create, "kcf": cv2.TrackerKCF_create,
    "boosting": cv2.TrackerBoosting_create, "mil": cv2.TrackerMIL_create,
    "tld": cv2.TrackerTLD_create,
    "medianflow": cv2.TrackerMedianFlow_create, "mosse": cv2.TrackerMOSSE_create
}
trackers = cv2.MultiTracker_create()

# Read the camera's intrinsic and extrinsic parameters.
# Before use, fill your calibration results into the stereoCamera class in stereoconfig.py
config = stereoconfig.stereoCamera()
config.setMiddleBurryParams()
print(config.cam_matrix_left)

while True:
    ret, frame = vs.read()
    if frame is None:
        break
    # Crop and resize the right camera's view
    right_frame = frame[0:720, 1280:2560]
    (h, w) = right_frame.shape[:2]
    width = 800
    r = width / float(w)
    dim = (width, int(h * r))
    right_frame = cv2.resize(right_frame, dim, interpolation=cv2.INTER_AREA)
    # Crop and resize the left camera's view
    left_frame = frame[0:720, 0:1280]
    (h, w) = left_frame.shape[:2]
    width = 800
    r = width / float(w)
    dim = (width, int(h * r))
    left_frame = cv2.resize(left_frame, dim, interpolation=cv2.INTER_AREA)
    # Update the trackers on the left camera's frame
    (success, boxes) = trackers.update(left_frame)
    # Drawing loop
    for box in boxes:
        (x, y, w, h) = [int(v) for v in box]
        cv2.rectangle(left_frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        # Convert to the coordinates of the box's center point
        xx = round((2 * x + w) / 2)
        yy = round((2 * y + h) / 2)
        # Current image pair
        iml = left_frame   # left image
        imr = right_frame  # right image
        height, width = iml.shape[0:2]
        # Stereo rectification: get the undistortion/rectification maps and the
        # reprojection matrix Q used to compute spatial coordinates
        map1x, map1y, map2x, map2y, Q = getRectifyTransform(height, width, config)
        iml_rectified, imr_rectified = rectifyImage(iml, imr, map1x, map1y, map2x, map2y)
        print(Q)
        # Stereo matching
        iml_, imr_ = preprocess(iml, imr)  # preprocessing usually weakens uneven lighting; optional
        disp, _ = stereoMatchSGBM(iml, imr, False)  # the unrectified images are passed in here, because the Middlebury images used are already rectified
        dot_disp = disp[yy][xx]  # disparity at the target's center point
        cv2.imwrite('disparity.jpg', disp * 4)
        # Depth of the target point
        z = getDepthMapWithConfig(config)
        text = str(xx) + ',' + str(yy) + ',' + str(z)
        cv2.putText(left_frame, text, (x, y), cv2.FONT_HERSHEY_COMPLEX, 0.6, (0, 0, 255), 1)
    # Show the two windows
    cv2.imshow("right", right_frame)
    cv2.imshow('Frame', left_frame)
    # Key handling: "s" selects a new target, Esc quits
    key = cv2.waitKey(10) & 0xFF
    if key == ord('s'):
        box = cv2.selectROI('Frame', left_frame, fromCenter=False, showCrosshair=True)
        tracker = cv2.TrackerCSRT_create()
        print(type(box), type(box[0]), box[1], box)
        trackers.add(tracker, left_frame, box)
    elif key == 27:
        break

vs.release()
cv2.destroyAllWindows()


