Research
Our research has been generously supported by ARO, NSF, AFRL, IARPA, BlueHalo, and Salesforce.
2024
Xue, Nan; Tan, Bin; Xiao, Yuxi; Dong, Liang; Xia, Gui-Song; Wu, Tianfu; Shen, Yujun
NEAT: Distilling 3D Wireframes from Neural Attraction Fields Proceedings Forthcoming
In: CVPR'24, Forthcoming.
@proceedings{neat,
title = {NEAT: Distilling 3D Wireframes from Neural Attraction Fields},
author = {Nan Xue and Bin Tan and Yuxi Xiao and Liang Dong and Gui-Song Xia and Tianfu Wu and Yujun Shen},
year = {2024},
date = {2024-06-18},
urldate = {2024-06-18},
abstract = {This paper studies the problem of structured 3D reconstruction using wireframes that consist of line segments and junctions, focusing on the computation of structured boundary geometries of scenes. Instead of leveraging matching-based solutions from 2D wireframes (or line segments) for 3D wireframe reconstruction as done in prior art, we present NEAT, a \textbf{rendering-distilling} formulation using neural fields to represent 3D line segments with 2D observations, and bipartite matching for perceiving and distilling a sparse set of 3D global junctions. The proposed {NEAT} enjoys the joint optimization of the neural fields and the global junctions from scratch, using view-dependent 2D observations without precomputed cross-view feature matching.
Comprehensive experiments on the DTU and BlendedMVS datasets demonstrate NEAT's superiority over state-of-the-art alternatives for 3D wireframe reconstruction. Moreover, the 3D global junctions distilled by NEAT are a better initialization than SfM points for the recently emerged 3D Gaussian Splatting, enabling high-fidelity novel view synthesis with about 20 times fewer initial 3D points.},
howpublished = {In: CVPR'24},
keywords = {},
pubstate = {forthcoming},
tppubtype = {proceedings}
}
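The bipartite matching step described in the abstract, between a sparse set of global junction proposals and per-view 2D observations, can be illustrated with a short sketch. This is our illustration, not the paper's code: we assume projected proposals and detections are given as pixel arrays and use SciPy's Hungarian solver with Euclidean distance as the cost.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_projected_junctions(proposals, detections):
    """Bipartite matching between projected 3D junction proposals and
    detected 2D junctions in one view, via the Hungarian algorithm.

    proposals:  (N, 2) array of projected junction proposals (pixels)
    detections: (M, 2) array of detected 2D junctions (pixels)
    Returns (proposal_idx, detection_idx) pairs and the total match cost.
    """
    # Pairwise Euclidean distances form the assignment cost matrix.
    cost = np.linalg.norm(proposals[:, None, :] - detections[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    return list(zip(rows.tolist(), cols.tolist())), float(cost[rows, cols].sum())
```

For example, two proposals at (0, 0) and (10, 0) matched against detections at (10, 1) and (0, 1) yield the crossed assignment with total cost 2.0, rather than the greedy pairing by index.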
2023
Xue, Nan; Wu, Tianfu; Bai, Song; Wang, Fu-Dong; Xia, Gui-Song; Zhang, Liangpei; Torr, Philip H. S.
Holistically-Attracted Wireframe Parsing: From Supervised to Self-Supervised Learning Journal Article
In: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 45, no. 12, pp. 14727-14744, 2023.
@article{hawp-pami,
title = {Holistically-Attracted Wireframe Parsing: From Supervised to Self-Supervised Learning},
author = {Nan Xue and Tianfu Wu and Song Bai and Fu-Dong Wang and Gui-Song Xia and Liangpei Zhang and Philip H.S. Torr},
url = {https://arxiv.org/abs/2210.12971},
doi = {10.1109/TPAMI.2023.3312749},
year = {2023},
date = {2023-12-01},
urldate = {2023-03-14},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
volume = {45},
number = {12},
pages = {14727-14744},
abstract = {This paper presents Holistically-Attracted Wireframe Parsing (HAWP) for 2D images using both fully supervised and self-supervised learning paradigms. At the core is a parsimonious representation that encodes a line segment using a closed-form 4D geometric vector, which enables lifting line segments in wireframe to an end-to-end trainable holistic attraction field that has built-in geometry-awareness, context-awareness and robustness. The proposed HAWP consists of three components: generating line segment and end-point proposals, binding line segments and end-points, and end-point-decoupled lines-of-interest verification. For self-supervised learning, a simulation-to-reality pipeline is exploited in which a HAWP is first trained using synthetic data and then used to ``annotate'' wireframes in real images with Homographic Adaptation. With the self-supervised annotations, a HAWP model for real images is trained from scratch. In experiments, the proposed HAWP achieves state-of-the-art performance on both the Wireframe dataset and the YorkUrban dataset in fully-supervised learning. It also demonstrates a significantly better repeatability score than prior art with much more efficient training in self-supervised learning. Furthermore, the self-supervised HAWP shows great potential for general wireframe parsing without onerous wireframe labels.},
howpublished = {arXiv preprint},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Xue, Nan; Tan, Bin; Xiao, Yuxi; Dong, Liang; Xia, Gui-Song; Wu, Tianfu
Volumetric Wireframe Parsing from Neural Attraction Fields Online
2023, visited: 21.07.2023.
@online{NEAT,
title = {Volumetric Wireframe Parsing from Neural Attraction Fields},
author = {Nan Xue and Bin Tan and Yuxi Xiao and Liang Dong and Gui-Song Xia and Tianfu Wu},
url = {https://arxiv.org/abs/2307.10206},
year = {2023},
date = {2023-07-21},
urldate = {2023-07-21},
abstract = {The primal sketch is a fundamental representation in Marr's vision theory, which allows for parsimonious image-level processing from 2D to 2.5D perception. This paper takes a further step by computing 3D primal sketch of wireframes from a set of images with known camera poses, in which we take the 2D wireframes in multi-view images as the basis to compute 3D wireframes in a volumetric rendering formulation. In our method, we first propose a NEural Attraction (NEAT) Fields that parameterizes the 3D line segments with coordinate Multi-Layer Perceptrons (MLPs), enabling us to learn the 3D line segments from 2D observation without incurring any explicit feature correspondences across views. We then present a novel Global Junction Perceiving (GJP) module to perceive meaningful 3D junctions from the NEAT Fields of 3D line segments by optimizing a randomly initialized high-dimensional latent array and a lightweight decoding MLP. Benefitting from our explicit modeling of 3D junctions, we finally compute the primal sketch of 3D wireframes by attracting the queried 3D line segments to the 3D junctions, significantly simplifying the computation paradigm of 3D wireframe parsing. In experiments, we evaluate our approach on the DTU and BlendedMVS datasets with promising performance obtained. As far as we know, our method is the first approach to achieve high-fidelity 3D wireframe parsing without requiring explicit matching.},
howpublished = {arXiv preprint},
keywords = {},
pubstate = {published},
tppubtype = {online}
}
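The final step the abstract describes, attracting queried 3D line segments to the perceived global junctions to form the wireframe, can be sketched in a few lines. This is a hedged illustration under our own assumptions (nearest-junction snapping with a radius `tau`), not the paper's implementation:

```python
import math

def attract_segments_to_junctions(segments, junctions, tau=0.05):
    """Snap the endpoints of queried 3D line segments onto the nearest
    global junction, keeping a segment only when both endpoints land on
    distinct junctions within radius tau.

    segments:  iterable of ((x, y, z), (x, y, z)) endpoint pairs
    junctions: list of (x, y, z) global junction positions
    """
    def nearest(p):
        j = min(junctions, key=lambda q: math.dist(p, q))
        return j if math.dist(p, j) <= tau else None

    wireframe = []
    for a, b in segments:
        ja, jb = nearest(a), nearest(b)
        if ja is not None and jb is not None and ja != jb:
            wireframe.append((ja, jb))
    return wireframe
```

Segments whose endpoints are not explained by any junction are simply dropped, which is one simple way to realize the "significantly simplified computation paradigm" the abstract refers to.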
2020
Xue, Nan; Wu, Tianfu; Bai, Song; Wang, Fudong; Xia, Gui-Song; Zhang, Liangpei; Torr, Philip H. S.
Holistically-Attracted Wireframe Parsing Proceedings Article
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
@inproceedings{HAWP,
title = {Holistically-Attracted Wireframe Parsing},
author = {Nan Xue and Tianfu Wu and Song Bai and Fudong Wang and Gui-Song Xia and Liangpei Zhang and Philip H.S. Torr},
year = {2020},
date = {2020-02-23},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
abstract = {This paper presents a fast and parsimonious parsing method to accurately and robustly detect a vectorized wireframe in an input image with a single forward pass. The proposed method is end-to-end trainable, consisting of three components: (i) line segment and junction proposal generation, (ii) line segment and junction matching, and (iii) line segment and junction verification.
For computing line segment proposals, a novel exact dual representation is proposed which exploits a parsimonious geometric reparameterization for line segments and forms a holistic 4-dimensional attraction field map for an input image. Junctions can be treated as the ``basins'' in the attraction field. The proposed method is thus called Holistically-Attracted Wireframe Parser (HAWP). In experiments, the proposed method is tested on two benchmarks, the Wireframe dataset and the YorkUrban dataset. On both benchmarks, it obtains state-of-the-art performance in terms of accuracy and efficiency. For example, on the Wireframe dataset, compared to the previous state-of-the-art method L-CNN, it improves the challenging mean structural average precision (msAP) by a large margin ($2.8\%$ absolute improvements), and achieves 29.5 FPS on a single GPU ($89\%$ relative improvement). A systematic ablation study is performed to further justify the proposed method.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
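The "parsimonious geometric reparameterization" in the abstract, a closed-form 4D vector encoding a line segment relative to a pixel, can be sketched as follows. The exact parameterization below (distance and angle to the perpendicular foot, plus the signed angles under which the two endpoints are seen) is our plausible reading of the abstract, not necessarily the paper's precise formulation; what matters is that the encoding is exactly invertible:

```python
import math

def encode_segment(p, e1, e2):
    """Encode line segment (e1, e2) relative to pixel p as a 4D vector
    (d, theta, th1, th2): distance and angle from p to the foot of the
    perpendicular on the supporting line, plus the signed angles of the
    two endpoints relative to that perpendicular."""
    px, py = p
    ax, ay = e1
    bx, by = e2
    vx, vy = bx - ax, by - ay
    t = ((px - ax) * vx + (py - ay) * vy) / (vx * vx + vy * vy)
    fx, fy = ax + t * vx, ay + t * vy          # foot of the perpendicular
    dx, dy = fx - px, fy - py
    d = math.hypot(dx, dy)                     # assumes p is off the line
    theta = math.atan2(dy, dx)
    ux, uy = -dy / d, dx / d                   # unit direction along the line
    th1 = math.atan2((ax - fx) * ux + (ay - fy) * uy, d)
    th2 = math.atan2((bx - fx) * ux + (by - fy) * uy, d)
    return d, theta, th1, th2

def decode_segment(p, d, theta, th1, th2):
    """Exact inverse of encode_segment: recover the two endpoints."""
    px, py = p
    fx, fy = px + d * math.cos(theta), py + d * math.sin(theta)
    ux, uy = -math.sin(theta), math.cos(theta)
    e1 = (fx + d * math.tan(th1) * ux, fy + d * math.tan(th1) * uy)
    e2 = (fx + d * math.tan(th2) * ux, fy + d * math.tan(th2) * uy)
    return e1, e2
```

The round trip is exact up to floating-point error, which is what makes such a dual representation usable as a dense, per-pixel regression target for an attraction field map.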