A camera is configured to output a test depth+multi-spectral image including a plurality of pixels. Each pixel corresponds to one of the plurality of sensors of a sensor array of the camera and includes at least a depth value and a spectral value for each spectral light sub-band of a plurality of spectral illuminators of the camera. An object recognition machine is previously trained with a set of labeled training depth+multi-spectral images having a same structure as the test depth+multi-spectral image. The object recognition machine is configured to output a confidence value indicating a likelihood that the test depth+multi-spectral image includes a specified object.