Improving satellite pose estimation across domain gap with generative adversarial networks

Alessandro Lotti

Abstract. Pose estimation from a monocular camera is a critical technology for in-orbit servicing missions. However, collecting large image datasets in space for training neural networks is impractical, resulting in the use of synthetic images. Unfortunately, these often fail to accurately replicate real image features, leading to a significant domain gap. This work explores the use of generative adversarial networks as a solution for bridging this gap at the data level by making synthetic images more closely resemble real ones. A generative model is trained on a small subset of unpaired synthetic and real images from the SPEED+ dataset. The entire synthetic dataset is then augmented using the generator and employed to train a regression model, based on the MetaFormer architecture, which locates a set of landmarks. By comparing the model's pose estimation accuracy on real images with and without generator preprocessing, it is observed that the augmentation effectively reduces the median pose estimation error by a factor of 1.4 to 5. This result validates the efficacy of these tools and motivates further research into their use.
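Landmark-based pipelines of this kind typically recover the final pose from the regressed 2D landmarks with a Perspective-n-Point (PnP) solver. As a minimal illustration of that step only (not the paper's implementation), the sketch below solves PnP with a basic Direct Linear Transform, a simplified stand-in for production solvers such as EPnP; the function name, toy intrinsics, and point sets are assumptions for demonstration.

```python
import numpy as np

def solve_pnp_dlt(pts3d, pts2d, K):
    """Recover the camera pose [R|t] from >= 6 non-coplanar 3D-2D
    landmark correspondences via the Direct Linear Transform.

    pts3d: (n, 3) landmark coordinates in the target body frame.
    pts2d: (n, 2) detected landmark pixels.
    K:     (3, 3) camera intrinsic matrix.
    """
    # Normalize pixel coordinates with the camera intrinsics.
    uv1 = np.hstack([pts2d, np.ones((len(pts2d), 1))])
    xn = (np.linalg.inv(K) @ uv1.T).T  # rows are (x, y, 1)

    # Build the 2n x 12 homogeneous system A p = 0 for P = [R|t].
    A = []
    for (X, Y, Z), (x, y, _) in zip(pts3d, xn):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -x * X, -x * Y, -x * Z, -x])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -y * X, -y * Y, -y * Z, -y])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    P = Vt[-1].reshape(3, 4)  # null vector, defined up to scale

    # Undo the unknown scale: the rotation block of the true P has
    # unit singular values, so their mean gives the scale factor.
    scale = np.mean(np.linalg.svd(P[:, :3], compute_uv=False))
    P /= scale
    if np.linalg.det(P[:, :3]) < 0:  # fix the overall sign
        P = -P

    # Project the rotation block onto SO(3) and split off t.
    U, _, Vt2 = np.linalg.svd(P[:, :3])
    return U @ Vt2, P[:, 3]
```

With noise-free correspondences the DLT recovers the exact pose; in practice EPnP-style solvers with RANSAC are preferred because regressed landmarks are noisy and may contain outliers.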

Pose Estimation, Computer Vision, Vision-Based Navigation, Domain Gap, Generative Adversarial Networks

Published online 9/1/2023, 6 pages
Copyright © 2023 by the author(s)
Published under license by Materials Research Forum LLC., Millersville PA, USA

Citation: Alessandro Lotti, Improving satellite pose estimation across domain gap with generative adversarial networks, Materials Research Proceedings, Vol. 33, pp 376-381, 2023


The article was published as article 55 of the book Aerospace Science and Engineering

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
