What Is OpenGL® Picking?

OpenGL® picking in computer programming is the process of determining which object within a three-dimensional (3D) scene is located at a given point on the screen once the scene is rendered. It can also refer to locating multiple objects at a point or inside a box. Most often, OpenGL® picking is used to determine which 3D object on the screen a user is attempting to select with a mouse cursor. While this operation might appear simple, several subtleties in how OpenGL® renders a scene can make it fairly complex. Additionally, bugs in certain graphics cards and drivers can cause OpenGL® picking to fail or return false results.

When a user is looking at a 3D scene on a computer monitor, the resulting image is known as a rendering of the scene. The scene is actually stored in memory as a collection of primitive shapes or polygons, which themselves are just collections of 3D points within the space of the scene. The computer uses world coordinates, which are sometimes called absolute coordinates, to perform most basic functions that manipulate objects in the scene. In most applications, the user is able to maneuver the view of the scene to different angles so objects can be seen from different perspectives. The virtual location of the user within the scene is called the camera position, and the direction in which the user is looking is the camera angle.

The complexity of OpenGL® picking comes from relating the location of the mouse on the two-dimensional (2D) screen to a 3D scene that is being viewed from a possibly arbitrary camera position and angle. Additionally, because the rendering the human viewer sees is really 2D, there is no way for the user to indicate the depth of a mouse click inside the scene. The OpenGL® picking function solves this complex problem in two ways.

The first is that, instead of performing a series of separate calculations to work out where the viewer is and then find an object in the rendering window, the function renders the scene just as it normally would. The rendering used for selection is never displayed; it is used only to calculate the correct positions of objects. The other difference is that, instead of rendering the whole area that would be visible to the user, it renders only the small area where the mouse is located. This means any object rendered in this pass is, by construction, at the point where the mouse pointer is located.
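
The restricted rendering described above is traditionally set up with a "pick matrix" that zooms the view onto a few pixels around the cursor, which is the job of the GLU helper gluPickMatrix. The sketch below reproduces only that arithmetic in plain Python, working in normalized device coordinates. The function names, the 800 by 600 viewport and the 4 by 4 pixel region are assumptions made for illustration and are not part of any OpenGL® API.

```python
def pick_region_transform(x, y, width, height, viewport):
    """Return (scale_x, scale_y, offset_x, offset_y) that zooms normalized
    device coordinates so that a width-by-height pixel region centered on
    the window point (x, y) fills the whole view volume.

    viewport is (origin_x, origin_y, viewport_width, viewport_height).
    """
    vx, vy, vw, vh = viewport
    scale_x = vw / width
    scale_y = vh / height
    offset_x = (vw - 2.0 * (x - vx)) / width
    offset_y = (vh - 2.0 * (y - vy)) / height
    return scale_x, scale_y, offset_x, offset_y


def window_to_ndc(px, py, viewport):
    """Convert a window-space pixel to normalized device coordinates."""
    vx, vy, vw, vh = viewport
    return 2.0 * (px - vx) / vw - 1.0, 2.0 * (py - vy) / vh - 1.0


def in_pick_region(px, py, transform, viewport):
    """True if the pixel still lies inside [-1, 1] after the zoom is applied."""
    scale_x, scale_y, offset_x, offset_y = transform
    nx, ny = window_to_ndc(px, py, viewport)
    zoomed_x = scale_x * nx + offset_x
    zoomed_y = scale_y * ny + offset_y
    return -1.0 <= zoomed_x <= 1.0 and -1.0 <= zoomed_y <= 1.0


# Hypothetical 800x600 window with the cursor at pixel (200, 450).
viewport = (0, 0, 800, 600)
transform = pick_region_transform(200, 450, 4, 4, viewport)
near_cursor = in_pick_region(201, 451, transform, viewport)
far_from_cursor = in_pick_region(210, 450, transform, viewport)
```

With this transform applied before the normal projection, geometry near the cursor survives clipping and everything else is discarded. Note that OpenGL® window coordinates start at the lower-left corner, so a mouse position reported from the top-left usually has to be flipped vertically first.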

The second problem, namely having no way to indicate the depth of an area selected, is solved by returning all objects that are under the mouse coordinates in the scene. The OpenGL® picking function returns all of the objects in an array along with how far away they are from the viewer’s location. This allows a program to quickly find the closest object if desired.
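
As a rough illustration of how a program might find the closest object, the sketch below decodes a list of hit records and picks the nearest one. It assumes the layout used by the legacy selection buffer, where each record holds a name count, a minimum depth, a maximum depth and then the names themselves. The sample buffer contents and the helper names are invented for the example.

```python
def parse_hit_records(buffer, hit_count):
    """Decode hit_count records laid out back to back as
    [name_count, min_depth, max_depth, name_1, ..., name_n]."""
    hits = []
    index = 0
    for _ in range(hit_count):
        name_count = buffer[index]
        min_depth = buffer[index + 1]
        max_depth = buffer[index + 2]
        names = tuple(buffer[index + 3:index + 3 + name_count])
        hits.append((min_depth, max_depth, names))
        index += 3 + name_count
    return hits


def nearest_hit(hits):
    """Return the names of the hit with the smallest minimum depth,
    or None when nothing was under the cursor."""
    if not hits:
        return None
    return min(hits, key=lambda hit: hit[0])[2]


# Made-up buffer holding three hits; smaller depth values are closer.
sample_buffer = [
    1, 3000000000, 3100000000, 7,
    2, 1200000000, 1500000000, 4, 9,
    1, 2000000000, 2000000500, 2,
]
hits = parse_hit_records(sample_buffer, 3)
closest = nearest_hit(hits)
```

Here the second record has the smallest minimum depth, so its names are the ones returned. A record can carry several names because objects may be tagged hierarchically, for example a wheel that belongs to a car.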

One way to visualize OpenGL® picking is to imagine a line, sometimes referred to as a ray in 3D programming, moving from the location of the mouse pointer into the scene and away from the viewer’s location. Each object this ray touches is added to an array of objects, along with how far away it is from the viewer. This is a very simple explanation of how one form of OpenGL® picking works.
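
The ray picture can be sketched directly in code. The example below is a minimal illustration, not how OpenGL® implements picking internally: it assumes every object is a sphere, places the viewer at the origin looking down the negative z axis, and uses invented object names.

```python
import math


def ray_sphere_distance(origin, direction, center, radius):
    """Distance along a unit-length direction to the nearest intersection
    with the sphere, or None if the ray misses it."""
    to_origin = [o - c for o, c in zip(origin, center)]
    b = sum(d * v for d, v in zip(direction, to_origin))
    c = sum(v * v for v in to_origin) - radius * radius
    discriminant = b * b - c
    if discriminant < 0.0:
        return None
    root = math.sqrt(discriminant)
    for distance in (-b - root, -b + root):
        if distance >= 0.0:  # ignore intersections behind the viewer
            return distance
    return None


def pick_along_ray(origin, direction, spheres):
    """Return (distance, name) for every sphere the ray touches, nearest first."""
    length = math.sqrt(sum(d * d for d in direction))
    unit = [d / length for d in direction]
    hits = []
    for name, center, radius in spheres:
        distance = ray_sphere_distance(origin, unit, center, radius)
        if distance is not None:
            hits.append((distance, name))
    return sorted(hits)


# Hypothetical scene: two spheres in front of the viewer, one off to the side.
scene = [
    ("far", (0.0, 0.0, -10.0), 2.0),
    ("side", (5.0, 0.0, -5.0), 1.0),
    ("near", (0.0, 0.0, -5.0), 1.0),
]
ray_hits = pick_along_ray((0.0, 0.0, 0.0), (0.0, 0.0, -1.0), scene)
```

The ray touches the "near" and "far" spheres and misses "side", so the sorted list starts with the object closest to the viewer.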

Another method of object picking in OpenGL® involves locating an object by color, and it can be considerably faster. This method renders the scene but, instead of applying lighting and texture to the objects, draws each one in a single, flat color. Each object or group of objects has its own distinct color. The scene is rendered only in memory and is not displayed, so this does not affect what the user sees. Rather than looking for 3D collisions between objects, the program reads back the color at the position of the mouse cursor, and that color corresponds to a specific object.
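
The color method comes down to a reversible mapping between object identifiers and colors. The sketch below assumes 24-bit RGB colors and uses a small list-of-lists as a stand-in for the offscreen buffer; in a real program the scene would be drawn by OpenGL® and the pixel read back with a call such as glReadPixels. All function names and the rectangle "objects" are hypothetical.

```python
BACKGROUND_ID = 0  # reserved identifier meaning "nothing here"


def id_to_color(object_id):
    """Pack an identifier below 2**24 into an (r, g, b) tuple of bytes."""
    if not 0 <= object_id < 2 ** 24:
        raise ValueError("object_id must fit in 24 bits")
    return (object_id >> 16) & 0xFF, (object_id >> 8) & 0xFF, object_id & 0xFF


def color_to_id(color):
    """Recover the identifier from an (r, g, b) tuple."""
    red, green, blue = color
    return (red << 16) | (green << 8) | blue


def render_id_buffer(width, height, rectangles):
    """Stand-in for the hidden render pass: fill each rectangle, given in
    back-to-front order as (object_id, (left, bottom, right, top)), with
    the flat color that encodes its identifier."""
    buffer = [[id_to_color(BACKGROUND_ID)] * width for _ in range(height)]
    for object_id, (left, bottom, right, top) in rectangles:
        color = id_to_color(object_id)
        for row in range(bottom, top):
            for col in range(left, right):
                buffer[row][col] = color
    return buffer


def pick(buffer, x, y):
    """Return the identifier of the object at pixel (x, y), or None."""
    object_id = color_to_id(buffer[y][x])
    return None if object_id == BACKGROUND_ID else object_id


# Two overlapping hypothetical objects; object 2 is drawn on top of object 1.
id_buffer = render_id_buffer(8, 8, [(1, (0, 0, 4, 4)), (2, (2, 2, 6, 6))])
picked_front = pick(id_buffer, 3, 3)
picked_back = pick(id_buffer, 1, 1)
picked_nothing = pick(id_buffer, 7, 7)
```

For the colors to decode exactly, anything that alters pixel values, such as lighting, texturing, blending or antialiasing, has to be switched off during the hidden pass.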