Alexandre Duval, Cognition (2019)
This review is relevant to the debate on these wiki pages because it claims to identify evidence that discriminates between the predictions of a 'geometric module' (3D reconstruction) model and 'view-based' models. Although the review is long, the key experiment is one by Julian et al. (2015), which shows (very beautifully) that mice could discriminate between environments on the basis of features marked on the walls (vertical or horizontal stripes on one wall), but then seemed to ignore these very same features when picking a corner in which to hunt for food (they searched equally often in diagonally opposite corners). At the same time, the mice must be sensitive to geometric features (e.g. 'a long wall should be on my left and a short wall in front of me'), since only this sensitivity explains where they searched.
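The diagonal-corner confusion follows from the geometry alone, and can be sketched in a few lines (a hypothetical illustration, not code or data from Julian et al.): in a featureless rectangle, each corner is characterised only by which side the long wall falls on, and diagonally opposite corners share the same description.

```python
# Hedged sketch (invented labels, not from the paper): why a featureless
# rectangular arena is 180-degree rotationally ambiguous. Each corner is
# described only by its local geometry, i.e. which side (left/right) the
# long and short walls fall on when the animal faces into the corner.
from collections import defaultdict

# Corners of a wide, short rectangle, labelled by compass point.
corners = {
    "SW": ("long wall on left", "short wall on right"),
    "SE": ("short wall on left", "long wall on right"),
    "NE": ("long wall on left", "short wall on right"),
    "NW": ("short wall on left", "long wall on right"),
}

# Group corners by their local geometric description.
groups = defaultdict(list)
for corner, geometry in corners.items():
    groups[geometry].append(corner)

for geometry, indistinct in groups.items():
    print(geometry, "->", indistinct)
# The grouping pairs SW with NE and SE with NW: a purely geometric
# strategy cannot tell these apart, matching the equal search rates
# at diagonally opposite corners.
```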
It is a nice problem, clearly explained by Duval. What it shows is that the mice use the discriminative features (horizontal or vertical stripes, in this case) to identify the environment (broad context) and then use more specific features (whether a long wall is on their left or right, in this case) to choose where to hunt for food (fine context). It is interesting and somewhat perplexing that the strong visible features play so little role in the fine context compared to the (ambiguous) information about a long wall on one side; one possible reason is that information from the vibrissae and from running is very important to mice, while visual information matters less to them than it does to humans. Whatever the reason, the horizontal and vertical stripes are less effective than wall length in specifying the fine context, although they are effective in discriminating the broader context.
Duval gets round the problem by inventing two mechanisms that exactly match the two phases of the problem (recognising the appropriate environment and then finding the food location), just as people who work on short-range and long-range motion invent short-range and long-range motion mechanisms, people who work on first-order and second-order motion invent first-order and second-order motion mechanisms, people who work on random dot stereograms invent cyclopean mechanisms, people who work on sinewave stimuli invent spatial frequency channels, etc.
The relevant point for these pages is whether there is a convincing challenge to the idea of a representation that avoids 3D reconstruction. What Duval means by 'a view-based theory' differs from the model advocated here. He says (p11): "if VM theorists assume that subjects systematically record any depth information ... they risk implicitly smuggling into their accounts the assumption that subjects rely on geometric representations to select the relevant snapshot." The account given in these pages includes information about distance, slant and depth relief (without recourse to explicit 3D coordinate frames) and so differs from the model Duval attacks.
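The generic 'view-matching' step that Duval critiques can be illustrated with a minimal sketch (locations, vectors and the 8-element view are invented for illustration): the agent compares its current view, treated as a raw feature vector, to stored snapshots and picks the best match, with no 3D reconstruction of the scene.

```python
# Hedged sketch of a bare view-matching step: no depth, slant or 3D
# coordinates, just image-space comparison of the current view against
# stored snapshots. All names and numbers here are invented.
import numpy as np

rng = np.random.default_rng(0)
# One stored snapshot (an 8-element feature vector) per remembered place.
snapshots = {loc: rng.random(8) for loc in ["corner_A", "corner_B"]}

def best_match(current_view, snapshots):
    # Pick the stored snapshot with the smallest image-space difference.
    return min(snapshots, key=lambda loc: np.linalg.norm(current_view - snapshots[loc]))

# A noisy view taken near corner_A is matched back to corner_A.
view = snapshots["corner_A"] + 0.01 * rng.random(8)
print(best_match(view, snapshots))  # -> corner_A
```

Duval's point is that once depth or slant information is folded into the stored "view", the comparison starts to resemble the geometric representation the view-matching framework was meant to avoid; the account on these pages accepts such information while still declining an explicit 3D coordinate frame.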
Although it may seem odd that the length of the adjacent wall should be such a strong cue compared to the visual cues on the wall, this result is not conclusive evidence about whether the brain uses a 3D reconstruction or a representation that avoids one, since the finding can be incorporated into either model.
Perhaps the most striking message from the Julian et al experiment is that any model in which the pattern of vertical stripes in the figure above has an explicit location in the representation of the scene must explain why that location is then ignored in the food-finding task. Duval sets out this case very clearly.
- Duval, A. (2019). The representation selection problem: Why we should favor the geometric module framework of spatial reorientation over the view-matching framework. Cognition. https://doi.org/10.1016/j.cognition.2019.05.022
- Julian, J. B., Keinath, A. T., Muzzio, I. A., & Epstein, R. A. (2015). Place recognition and heading retrieval are mediated by dissociable cognitive systems in mice. Proceedings of the National Academy of Sciences, 112(20), 6503-6508.