- Authors:
- DaoQing Liao, School of Automation Science and Engineering, South China University of Technology, Guangzhou, China (https://ror.org/0530pts50)
- Wei Ai, School of Automation Science and Engineering, South China University of Technology, Guangzhou, China (https://ror.org/0530pts50)
Journal of Real-Time Image Processing, Volume 21, Issue 2, Apr 2024. https://doi.org/10.1007/s11554-023-01412-6
Published: 09 February 2024
Abstract
In many robotic and autonomous driving tasks, traditional visual SLAM algorithms estimate the camera’s position in a scene from sparse feature points and represent the map by estimating the depth of sparse point clouds. Practical applications, however, require SLAM to build dense maps in real time and to overcome the sparsity and occlusion issues of point clouds. It is also advantageous for a SLAM map to have an auto-completion capability: when the camera observes only 80% of an object, the map can automatically infer and complete the remaining 20%. A denser and more intelligent map representation is therefore needed. In this paper, we propose a Visual–Inertial SLAM with Neural Radiance Fields (NeRF) reconstruction to address these challenges. We integrate traditional rule-based optimization with NeRF. This approach updates NeRF local functions in real time by rapidly estimating camera motion and sparse feature point depths to reconstruct 3D scenes. To obtain better camera poses and a globally consistent map, we address IMU noise spikes caused by rapid motion changes and handle the pose adjustments caused by loop closure fusion. Specifically, we widen the static noise covariance to refit the dynamic noise covariance. During loop closure fusion, we treat the pose adjustment between the pre- and post-loop-closure states as a spatiotemporal transformation and migrate the NeRF parameters from the pre-closure to the post-closure state, which expedites loop closure adjustments in NeRF mapping. Moreover, we extend the method to scenarios with only grayscale images: by expanding the color channels of grayscale images and applying a linear spatial mapping, we can rapidly reconstruct 3D scenes from grayscale images alone. We demonstrate the precision and speed advantages of our method in both RGB and grayscale scenes.
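The grayscale extension mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the channels are expanded or which linear mapping is used, so the code below assumes the simplest reading: each intensity is replicated into three identical channels, and a rendered three-channel image is mapped back to one channel by a weighted sum. The function names `expand_gray_to_rgb` and `linear_map_to_gray` and the equal weights are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Gray = List[List[float]]        # H x W intensities in [0, 1]
Rgb = List[List[List[float]]]   # H x W x 3 channel values in [0, 1]


def expand_gray_to_rgb(gray: Gray) -> Rgb:
    """Replicate each intensity into three identical channels.

    Assumption: plain replication; the paper may use a different expansion.
    """
    return [[[v, v, v] for v in row] for row in gray]


def linear_map_to_gray(
    rgb: Rgb, weights: Sequence[float] = (1 / 3, 1 / 3, 1 / 3)
) -> Gray:
    """Map a three-channel image back to one channel by a linear combination.

    Assumption: equal weights; any weights that sum to 1 preserve a
    replicated grayscale input.
    """
    return [
        [sum(w * c for w, c in zip(weights, pixel)) for pixel in row]
        for row in rgb
    ]


# A 2 x 2 grayscale image, expanded so an RGB-only pipeline can consume it.
gray = [[0.0, 0.5], [0.25, 1.0]]
rgb = expand_gray_to_rgb(gray)
recovered = linear_map_to_gray(rgb)
```

Under these assumptions, the expanded image can be fed to a NeRF pipeline that expects three color channels, and the linear mapping recovers a grayscale rendering for comparison with the input frames.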
Published in
Journal of Real-Time Image Processing Volume 21, Issue 2
Apr 2024
529 pages
ISSN:1861-8200
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
Publisher
Springer-Verlag
Berlin, Heidelberg
Publication History
- Published: 9 February 2024
- Accepted: 30 December 2023
- Received: 5 December 2023
Author Tags
- NeRF
- SLAM
- Intelligent map
- Real-time online algorithm
Qualifiers
- research-article