There's a point in space that represents the lens of the observer's eye, and a rectangle in space that represents the viewport. This rectangle is divided into pixel-equivalent square areas. For each area, one or more sample rays are drawn from the lens point through the area and traced until each encounters a surface in the scene. At that point, the material rules of the surface might generate another ray for specular reflection, a cone for diffuse reflection, and another ray for refraction, and they also add the emissive light value from that material. If the specular or diffuse reflections encounter a light source or ambient light, they add some of that light to the pixel-equivalent.
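The first step above, drawing a ray from the lens point through a pixel-equivalent area, can be sketched as follows. This is a minimal illustration, not a full renderer; the eye point, viewport origin, and the right/up edge vectors of the viewport rectangle are assumed inputs.

```python
import math

def primary_ray(eye, viewport_origin, right, up, px, py, width, height):
    """Build a ray from the eye through the center of pixel (px, py)
    on a viewport rectangle spanned by the `right` and `up` vectors.
    All parameter names are illustrative, not from any particular API."""
    # Parametric position of the pixel center within the viewport rectangle.
    u = (px + 0.5) / width
    v = (py + 0.5) / height
    point = tuple(viewport_origin[i] + u * right[i] + v * up[i] for i in range(3))
    # Direction from the lens point through the pixel center, normalized.
    d = tuple(point[i] - eye[i] for i in range(3))
    n = math.sqrt(sum(c * c for c in d))
    return eye, tuple(c / n for c in d)
```

Sampling more than one ray per area just means jittering the (u, v) offset instead of always using the pixel center.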
The diffuse cones send out a sampling of rays and attenuate the light from the light source based on how many of those rays reach it, rather than being blocked by some other object.
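That attenuation is just a hit fraction over the sample rays. A minimal Monte Carlo sketch, where `light_samples` and `occluded` are hypothetical stand-ins for the scene's light-sampling and occlusion tests:

```python
def diffuse_attenuation(hit_point, light_samples, occluded, n_rays=64):
    """Cast n_rays sample rays from hit_point toward points on the light;
    the fraction that reach it unblocked attenuates the light's
    contribution. `light_samples()` yields a point on the light and
    `occluded(a, b)` tests whether the segment a->b is blocked (both
    are hypothetical callbacks, not a real API)."""
    hits = 0
    for _ in range(n_rays):
        target = light_samples()
        if not occluded(hit_point, target):
            hits += 1
    return hits / n_rays  # 1.0 = fully lit, 0.0 = fully shadowed
```

A point half-hidden from an area light would score around 0.5, which is exactly the soft-shadow penumbra.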
Instead of casting light into the scene and calculating how much passes through the viewport to the lens, ray-tracing cheats by working backwards, because photons traveling backward in time follow exactly the same rules as those traveling forward in time. Every path traced backwards in time from the eye to a light source corresponds to a photon that left that light source with exactly the right direction and polarization to enter the eye. So the only photons calculated are the ones that contribute to the scene as viewed by the eye.
1) a point / pixel in the scene (as viewed by the eye) sends out a cone of rays, and the final color of this pixel is a combination of what those rays hit. This is the ray-casting process, light propagation in reverse.
2) the overall picture of the scene is the combination of pixels each calculated by the above ray casting process.
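Step 2 is nothing more than a loop over the pixel grid, feeding each pixel through step 1. A trivial sketch, where `trace_pixel` is a hypothetical stand-in for the whole ray-casting process of step 1:

```python
def render(width, height, trace_pixel):
    """The overall picture is just the grid of per-pixel results:
    trace_pixel(x, y) performs step 1 (cast rays through pixel (x, y)
    and combine what they hit) and returns that pixel's color."""
    return [[trace_pixel(x, y) for x in range(width)]
            for y in range(height)]
```

Because each pixel is computed independently, this loop is also why ray tracing parallelizes so well.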
Yes. The problem with working backwards is that some optical calculations have probability elements. A photon that hits a half-silvered mirror has a 50% chance of (specular) reflecting and a 50% chance of transmitting.
So for ray-tracing, you calculate along both paths and give 50% weight to each. Every time a ray hits a triangle in the scene, the material properties determine how the various components are weighted and summed into the color of the pixel.
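The weighting idea can be shown with a toy recursion. Here the scene is reduced to a hypothetical lookup table: a ray label maps to "light", "miss", or a pair of follow-up labels at a half-silvered mirror, where both branches are traced and summed at 50% weight each instead of flipping a coin per photon.

```python
def trace(path, depth=0, max_depth=4):
    """Toy model (hypothetical scene data, not a real tracer): return
    the radiance carried back along `path`. At a half-silvered mirror,
    evaluate both the reflected and transmitted branches and weight
    each by its 50% probability."""
    if depth >= max_depth:
        return 0.0
    hit = scene[path]
    if hit == "light":
        return 1.0      # emitted radiance of the light source
    if hit == "miss":
        return 0.0      # ray escaped the scene
    reflected, transmitted = hit  # half-silvered mirror: follow both
    return 0.5 * trace(reflected, depth + 1) + 0.5 * trace(transmitted, depth + 1)

# Hypothetical scene: the primary ray hits a mirror; its reflection
# reaches a light, its transmission misses everything.
scene = {"primary": ("r", "t"), "r": "light", "t": "miss"}
```

Here `trace("primary")` yields 0.5: only the reflected half of the split reaches the light, so the pixel gets half the light's radiance. A production tracer would instead sample one branch at random with those probabilities (Russian roulette) to keep the branching from exploding, but the expected value is the same.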