Larger translations of the camera or eye, such as those that occur during navigation, are generally handled in computer vision by building a 3D reconstruction (Simultaneous Localisation and Mapping, SLAM). Biologists have instead explored view-based approaches and compared them with 3D reconstruction. Mallot, Bülthoff and colleagues were among the first to advocate view-based approaches as a model of human navigation. The aim of these pages on 3D vision is to describe how rotations and translations of the camera, both small and large, could be united in a common framework based only on images related by motor outputs. Such a framework must explain not only the rules for navigating from one image to the next but also the ability to discriminate different surface slants, relative depths and locations of objects viewed from a wide variety of vantage points.
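The core of a view-based scheme can be sketched very simply: store the image seen at the goal, then move so as to reduce the mismatch between the current view and that stored snapshot, in the spirit of the scene-based homing studied by Franz et al. (1998). The toy `view` function below, which maps a position to a few brightness samples, is an assumption made purely for illustration; a real system would use camera images.

```python
def view(pos):
    """Toy panoramic 'image': brightness samples that vary smoothly
    with position, so image mismatch falls off with distance.
    (Illustrative stand-in for a real camera image.)"""
    x, y = pos
    return [x + y, x - y, 2 * x, 2 * y]

def image_difference(a, b):
    """Sum of squared pixel differences between two views."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def home(start, snapshot, step=1, max_iters=100):
    """Greedy snapshot homing: at each iteration, take the unit move
    (or stay put) that best matches the stored goal snapshot."""
    pos = start
    moves = [(step, 0), (-step, 0), (0, step), (0, -step), (0, 0)]
    for _ in range(max_iters):
        best = min(
            ((pos[0] + dx, pos[1] + dy) for dx, dy in moves),
            key=lambda p: image_difference(view(p), snapshot),
        )
        if best == pos:  # no move improves the match: goal reached
            return pos
        pos = best
    return pos

goal = (5, 3)
snapshot = view(goal)          # image stored at the goal location
print(home((0, 0), snapshot))  # agent recovers the goal position
```

Because the toy views vary smoothly with position, the image difference forms a single basin around the goal and greedy descent suffices; with real images the match surface is only locally smooth, which is one reason snapshot homing works within a limited catchment area around each stored view.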
- Davison, A. J., Reid, I. D., Molton, N. D., & Stasse, O. (2007). MonoSLAM: Real-time single camera SLAM. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6), 1052-1067.
- Newman, P., & Ho, K. (2005). SLAM-loop closing with visually salient features. In Proceedings of the 2005 IEEE International Conference on Robotics and Automation (ICRA 2005) (pp. 635-642). IEEE.
- Franz, M. O., Schölkopf, B., Mallot, H. A., & Bülthoff, H. H. (1998). Where did I take that snapshot? Scene-based homing by image matching. Biological Cybernetics, 79(3), 191-202.
- Gillner, S., & Mallot, H. A. (1998). Navigation and acquisition of spatial knowledge in a virtual maze. Journal of Cognitive Neuroscience, 10(4), 445-463.