Neuroscientists have been trying to understand how the brain processes visual information for over a century. The development of computational models inspired by the brain's layered organization, also ...
May. 2nd, 2024: Vision Mamba (Vim) is accepted by ICML2024. 🎉 Conference page can be found here. Feb. 10th, 2024: We update Vim-tiny/small weights and training scripts. By placing the class token at ...
Abstract: Real-time dense mapping with high-fidelity textures in large-scale environments is such a challenge in robots, digital twins, and AR/VR applications. Neural Radiance Field (NeRF) has ...
Abstract: Vision-and-Language Navigation (VLN) agents are tasked with navigating an unseen environment using natural language instructions. In this work, we study if visual representations of ...