


How To Calculate Distance Between Object And Camera Lens

Last updated on July 8, 2021.

Find distance from camera to object using Python and OpenCV

A couple of days ago, Cameron, a PyImageSearch reader, emailed in and asked about methods to find the distance from a camera to an object/marker in an image. He had spent some time researching, but hadn't found an implementation.

I knew exactly how Cameron felt. Years ago I was working on a small project to analyze the movement of a baseball as it left the pitcher's hand and headed for home plate.

Using motion analysis and trajectory-based tracking I was able to find/estimate the ball location in the frame of the video. And since a baseball has a known size, I was also able to estimate the distance to home plate.

It was an interesting project to work on, although the system was not as accurate as I wanted it to be — the "motion blur" of the ball moving so fast made it hard to obtain highly accurate estimates.

My project was definitely an "outlier" situation, but in general, determining the distance from a camera to a marker is actually a very well studied problem in the computer vision/image processing space. You can find techniques that are very straightforward and succinct like the triangle similarity. And you can find methods that are complex (albeit, more accurate) using the intrinsic parameters of the camera model.

In this blog post I'll show you how Cameron and I came up with a solution to compute the distance from our camera to a known object or marker.

Definitely give this post a read — you won't want to miss it!

  • Update July 2021: Added three new sections. The first section covers improving distance measurement with camera calibration. The second section discusses stereo vision and depth cameras to measure distance. And the final section briefly mentions how LiDAR can be used with camera data to provide highly accurate distance measurements.


Triangle Similarity for Object/Marker to Camera Distance

In order to determine the distance from our camera to a known object or marker, we are going to utilize triangle similarity.

The triangle similarity goes something like this: Let's say we have a marker or object with a known width W. We then place this marker some distance D from our camera. We take a picture of our object using our camera and then measure the apparent width in pixels P. This allows us to derive the perceived focal length F of our camera:

F = (P x D) / W

For example, let's say I place a standard piece of 8.5 x 11in paper (horizontally; W = 11) D = 24 inches in front of my camera and take a photo. When I measure the width of the piece of paper in the image, I notice that the perceived width of the paper is P = 248 pixels.

My focal length F is then:

F = (248px x 24in) / 11in = 541.09

As I continue to move my camera both closer and farther away from the object/marker, I can apply the triangle similarity to determine the distance of the object to the camera:

D' = (W x F) / P

Again, to make this more concrete, let's say I move my camera 3 ft (or 36 inches) away from my marker and take a photo of the same piece of paper. Through automatic image processing I am able to determine that the perceived width of the piece of paper is now 170 pixels. Plugging this into the equation we now get:

D' = (11in x 541.09) / 170 = 35in

Or roughly 36 inches, which is 3 feet.
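If you want to verify the arithmetic yourself, here is a quick standalone snippet (my own sanity check, not part of the script we build below) that reproduces both calculations:

# sanity check of the triangle similarity arithmetic above
KNOWN_WIDTH = 11.0      # W: paper width in inches
KNOWN_DISTANCE = 24.0   # D: calibration distance in inches

# F = (P x D) / W, with a perceived width of P = 248 pixels
focalLength = (248 * KNOWN_DISTANCE) / KNOWN_WIDTH
print(focalLength)                          # ~541.09

# D' = (W x F) / P, with a new perceived width of P = 170 pixels
print((KNOWN_WIDTH * focalLength) / 170)    # ~35.0 inches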

Note: When I captured the photos for this example my tape measure had a bit of slack in it and thus the results are off by roughly 1 inch. Furthermore, I also captured the photos hastily and not 100% on top of the feet markers on the tape measure, which added to the 1 inch error. That all said, the triangle similarity still holds and you can use this method to compute the distance from an object or marker to your camera quite easily.

Make sense now?

Awesome. Let's move into some code to see how finding the distance from your camera to an object or marker is done using Python, OpenCV, and image processing and computer vision techniques.

Finding the distance from your camera to object/marker using Python and OpenCV

Let's go ahead and get this project started. Open up a new file, name it distance_to_camera.py, and we'll get to work:

# import the necessary packages
from imutils import paths
import numpy as np
import imutils
import cv2

def find_marker(image):
    # convert the image to grayscale, blur it, and detect edges
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (5, 5), 0)
    edged = cv2.Canny(gray, 35, 125)

    # find the contours in the edged image and keep the largest one;
    # we'll assume that this is our piece of paper in the image
    cnts = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)
    c = max(cnts, key=cv2.contourArea)

    # compute the bounding box of the paper region and return it
    return cv2.minAreaRect(c)

The first thing we'll do is import our necessary packages (Lines 2-5). We'll use paths from imutils to load the available images in a directory. We'll use NumPy for numerical processing and cv2 for our OpenCV bindings.

From there we define our find_marker function. This function accepts a single argument, image, and is meant to be utilized to find the object we want to compute the distance to.

In this case we are using a standard piece of 8.5 x 11 inch paper as our marker.

Our first task is to now find this piece of paper in the image.

To do this, we'll convert the image to grayscale, blur it slightly to remove high frequency noise, and apply edge detection on Lines 9-11.

After applying these steps our image should look something like this:

Figure 1: Applying edge detection to find our marker, which in this case is a piece of paper.

As you can see, the edges of our marker (the piece of paper) have clearly been revealed. Now all we need to do is find the contour (i.e., outline) that represents the piece of paper.

We find our marker on Lines 15 and 16 by using the cv2.findContours function (taking care to handle different OpenCV versions) and then determining the contour with the largest area on Line 17.

We are making the assumption that the contour with the largest area is our piece of paper. This assumption works for this particular example, but in reality finding the marker in an image is highly application specific.

In our example, simple edge detection and finding the largest contour works well. We could also make this example more robust by applying contour approximation, discarding any contours that do not have 4 points (since a piece of paper is a rectangle and thus has 4 points), and then finding the largest 4-point contour (see the sketch after the note below).

Note: More on this methodology can be found in this post on building a kick-ass mobile document scanner.
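For completeness, here is a minimal sketch of that more robust approach (my own illustration, not code from the original project), using cv2.approxPolyDP to keep only 4-point contours:

# a minimal sketch of the 4-point contour approach described above;
# assumes the same edge map produced by find_marker's Canny step
import cv2
import imutils

def find_paper_contour(edged):
    # grab all contours from the edge map (OpenCV version safe)
    cnts = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)

    # examine contours from largest to smallest area
    for c in sorted(cnts, key=cv2.contourArea, reverse=True):
        # approximate the contour with fewer vertices
        peri = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.02 * peri, True)

        # a piece of paper should approximate to exactly 4 points
        if len(approx) == 4:
            return approx

    # no rectangular contour was found
    return None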

Another alternative to finding markers in images is to use color, such that the color of the marker is substantially different from the rest of the scene in the image. You could also apply methods like keypoint detection, local invariant descriptors, and keypoint matching to find markers; however, these approaches are outside the scope of this article and are, again, highly application specific. (A quick sketch of the color-based route follows.)
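To illustrate the color-based alternative, here is a minimal sketch that assumes a hypothetical bright green marker; the HSV bounds are made-up values you would tune for your own scene:

# hypothetical color-based marker detection (illustrative values only)
import cv2
import imutils
import numpy as np

def find_marker_by_color(image):
    # threshold on an assumed bright green range in HSV space
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array([40, 80, 80]), np.array([80, 255, 255]))

    # treat the largest blob in the mask as the marker
    cnts = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)
    if not cnts:
        return None
    return cv2.minAreaRect(max(cnts, key=cv2.contourArea))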

Anyway, now that we have the contour that corresponds to our marker, we return the bounding box, which contains the (x, y)-coordinates and the width and height of the box (in pixels), to the calling function on Line 20.

Let's also quickly define a function that computes the distance to an object using the triangle similarity detailed above:

def distance_to_camera(knownWidth, focalLength, perWidth):
    # compute and return the distance from the marker to the camera
    return (knownWidth * focalLength) / perWidth

This function takes a knownWidth of the marker, a computed focalLength, and the perceived width of an object in an image (measured in pixels), and applies the triangle similarity detailed above to compute the actual distance to the object.

To see how we utilize these functions, continue reading:

# initialize the known distance from the camera to the object, which
# in this case is 24 inches
KNOWN_DISTANCE = 24.0

# initialize the known object width, which in this case, the piece of
# paper is 11 inches wide
KNOWN_WIDTH = 11.0

# load the first image that contains an object that is KNOWN TO BE 2 feet
# from our camera, then find the paper marker in the image, and initialize
# the focal length
image = cv2.imread("images/2ft.png")
marker = find_marker(image)
focalLength = (marker[1][0] * KNOWN_DISTANCE) / KNOWN_WIDTH

The first step to finding the distance to an object or marker in an image is to calibrate and compute the focal length. To do this, we need to know:

  • The distance of the camera from an object.
  • The width (in units such as inches, meters, etc.) of this object. Note: The height could also be utilized, but this example uses only the width.

Let's also take a second and mention that what we are doing is not true camera calibration. True camera calibration involves the intrinsic parameters of the camera, which you can read more on here.

On Line 28 we initialize our known KNOWN_DISTANCE from the camera to our object to be 24 inches. And on Line 32 we initialize the KNOWN_WIDTH of the object to be 11 inches (i.e., a standard 8.5 x 11 inch piece of paper laid out horizontally).

The next step is important: it's our simple calibration step.

We load the first image off disk on Line 37 — we'll be using this image as our calibration image.

Once the image is loaded, we find the piece of paper in the image on Line 38, and then compute our focalLength on Line 39 using the triangle similarity.

Now that we have "calibrated" our system and have the focalLength, we can compute the distance from our camera to our marker in subsequent images quite easily.

Let's see how this is done:

# loop over the images
for imagePath in sorted(paths.list_images("images")):
    # load the image, find the marker in the image, then compute the
    # distance to the marker from the camera
    image = cv2.imread(imagePath)
    marker = find_marker(image)
    inches = distance_to_camera(KNOWN_WIDTH, focalLength, marker[1][0])

    # draw a bounding box around the image and display it
    box = cv2.cv.BoxPoints(marker) if imutils.is_cv2() else cv2.boxPoints(marker)
    box = np.int0(box)
    cv2.drawContours(image, [box], -1, (0, 255, 0), 2)
    cv2.putText(image, "%.2fft" % (inches / 12),
        (image.shape[1] - 200, image.shape[0] - 20), cv2.FONT_HERSHEY_SIMPLEX,
        2.0, (0, 255, 0), 3)
    cv2.imshow("image", image)
    cv2.waitKey(0)

We start looping over our image paths on Line 42.

Then, for each image in the list, we load the image off disk on Line 45, find the marker in the image on Line 46, and then compute the distance of the object to the camera on Line 47.

From there, we simply draw the bounding box around our marker and display the distance on Lines 50-57 (the boxPoints are calculated on Line 50, taking care to handle different OpenCV versions).

Results

To see our script in action, open up a terminal, navigate to your code directory, and execute the following command:

$ python distance_to_camera.py          

If all goes well you should first see the results of 2ft.png, which is the image we use to "calibrate" our system and compute our initial focalLength:

Figure 2: This image is used to compute the initial focal length of the system. We start by utilizing the known width of the object/marker in the image and the known distance to the object.

From the above image we can see that our focal length is properly determined and the distance to the piece of paper is 2 feet, per the KNOWN_DISTANCE and KNOWN_WIDTH variables in the code.

Now that we have our focal length, we can compute the distance to our marker in subsequent images:

Figure 3: Utilizing the focal length to determine that our piece of paper marker is roughly 3 feet from our camera.

In the above example our camera is now approximately 3 feet from the marker.

Let's try moving back another foot:

Figure 4: Utilizing the computed focal length to determine our camera is roughly 4 feet from our marker.

Once again, it's important to note that when I captured the photos for this example I did so hastily and left too much slack in the tape measure. Furthermore, I also did not ensure my camera was 100% lined up on the foot markers, so again, there is roughly 1 inch of error in these examples.

That all said, the triangle similarity approach detailed in this article will still work and allow you to find the distance from an object or marker in an image to your camera.

Improving distance measurement with camera calibration

In order to perform distance measurement, we first need to calibrate our system. In this post, we used a simple "pixels per metric" technique.

However, better accuracy can be obtained by performing a proper camera calibration, computing both the extrinsic and intrinsic parameters:

  • Extrinsic parameters are rotation and translation matrices used to convert something from the world frame to the camera frame
  • Intrinsic parameters are the internal camera parameters, such as the focal length, to convert that information into a pixel

The most common way is to perform a checkerboard camera calibration using OpenCV. Doing so will remove radial distortion and tangential distortion, both of which impact the output image, and therefore the output measurement of objects in the image.
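To give you a feel for what that looks like, here is a minimal calibration sketch; the calibration/*.png glob and the 9x6 interior-corner count are assumptions you would adapt to your own checkerboard shots:

# a minimal checkerboard calibration sketch (assumes a 9x6 grid of
# interior corners and calibration photos in calibration/*.png)
import glob
import cv2
import numpy as np

# object points for one view: (0,0,0), (1,0,0), ... in board units
objp = np.zeros((9 * 6, 3), np.float32)
objp[:, :2] = np.mgrid[0:9, 0:6].T.reshape(-1, 2)

objPoints, imgPoints = [], []
for path in glob.glob("calibration/*.png"):
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, (9, 6), None)
    if found:
        objPoints.append(objp)
        imgPoints.append(corners)

# compute the camera matrix and distortion coefficients
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    objPoints, imgPoints, gray.shape[::-1], None, None)

# undistort frames before measuring objects in them, e.g.:
# undistorted = cv2.undistort(image, mtx, dist)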

Here are some resources to help you get started with camera calibration:

  • Understanding Lens Distortion
  • Camera Calibration using OpenCV
  • Camera Calibration (official OpenCV documentation)

Stereo vision and depth cameras for distance measurement

As humans, we take having two eyes for granted. Even if we lost an eye in an accident we could still see and perceive the world around us.

However, with only one eye we lose important information — depth.

Depth perception gives us the perceived distance from where we stand to the object in front of us. In order to perceive depth, we need two eyes.

Most cameras, on the other hand, only have one eye, which is why it's so challenging to obtain depth information using a standard 2D camera (although you can use deep learning models to try to "learn depth" from a 2D image).

You can create a stereo vision system capable of computing depth information using two cameras, such as USB webcams:

  • Introduction to Epipolar Geometry and Stereo Vision
  • Making A Low-Cost Stereo Camera Using OpenCV
  • Depth perception using stereo camera (Python/C++)

Or, you could use specific hardware that has depth cameras built in, such as OpenCV's AI Kit (OAK-D).
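If you go the two-webcam route, the core computation is a disparity map. Here is a minimal block-matching sketch; it assumes you already have a rectified left/right pair from a calibrated rig (left.png and right.png are placeholder filenames):

# a minimal stereo block-matching sketch (assumes rectified images)
import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# compute a disparity map; larger disparity means a closer object
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype("float32") / 16.0

# with focal length f (in pixels) and baseline B (in meters) from
# calibration, depth = f * B / disparity at each valid pixel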

LiDAR for depth data

For more accurate depth information you should consider using LiDAR. LiDAR uses light sensors to measure the distance between the sensor and any object(s) in front of it.

LiDAR is especially popular in self-driving cars, where a camera is simply not enough.

We cover how to use LiDAR for self-driving car applications inside PyImageSearch University.

What's next? I recommend PyImageSearch University.

Course information:
35+ total classes • 39h 44m video • Last updated: Feb 2022
★★★★★ 4.84 (128 Ratings) • 3,000+ Students Enrolled

I strongly believe that if you had the right teacher you could master computer vision and deep learning.

Do you think learning computer vision and deep learning has to be time-consuming, overwhelming, and complicated? Or has to involve complex mathematics and equations? Or requires a degree in computer science?

That's not the case.

All you need to master computer vision and deep learning is for someone to explain things to you in simple, intuitive terms. And that's exactly what I do. My mission is to change education and how complex Artificial Intelligence topics are taught.

If you're serious about learning computer vision, your next stop should be PyImageSearch University, the most comprehensive computer vision, deep learning, and OpenCV course online today. Here you'll learn how to successfully and confidently apply computer vision to your work, research, and projects. Join me in computer vision mastery.

Inside PyImageSearch University you'll find:

  • ✓ 35+ courses on essential computer vision, deep learning, and OpenCV topics
  • ✓ 35+ Certificates of Completion
  • ✓ 39h 44m on-demand video
  • ✓ Brand new courses released every month, ensuring you can keep up with state-of-the-art techniques
  • ✓ Pre-configured Jupyter Notebooks in Google Colab
  • ✓ Run all code examples in your web browser — works on Windows, macOS, and Linux (no dev environment configuration required!)
  • ✓ Access to centralized code repos for all 500+ tutorials on PyImageSearch
  • ✓ Easy 1-click downloads for code, datasets, pre-trained models, etc.
  • ✓ Access on mobile, laptop, desktop, etc.

Click here to join PyImageSearch University

Summary

In this blog post we learned how to determine the distance from a known object in an image to our camera.

To accomplish this task we utilized the triangle similarity, which requires us to know two important parameters prior to applying our algorithm:

  1. The width (or height) in some distance measure, such as inches or meters, of the object we are using as a marker.
  2. The distance (in inches or meters) of the camera to the marker in step 1.

Computer vision and image processing algorithms can then be used to automatically determine the perceived width/height of the object in pixels, complete the triangle similarity, and give us our focal length.

Then, in subsequent images we simply need to find our marker/object and use the computed focal length to determine the distance to the object from the camera.


Source: https://pyimagesearch.com/2015/01/19/find-distance-camera-objectmarker-using-python-opencv/
