Traditional Village Classification Model Based on Transformer Network

Qi Zhong

Abstract


The study of traditional villages carries cultural, historical, and social significance. The architectural styles of Qiang, Tibetan, Han, and Hui ethnic villages are distinctive and have received considerable research attention, yet identifying the type of a traditional village quickly and accurately during practical surveys remains a challenge. To address this issue, this paper builds an aerial image dataset of Qiang, Tibetan, Han, and Hui ethnic villages and introduces a specialized feature extraction network, Transformer-Village, for deep-learning-based classification and detection of traditional villages. The network is lightweight overall and uses CondConv dynamic convolution as its core layer structure; in addition, a feature extraction network built on spatial self-attention is designed following the Transformer. In simulated experiments, Transformer-Village coupled with the YOLO detector achieves 97.2% mAP on the test set, a higher detection accuracy than the baseline models it is compared against. The experimental results suggest that the approach is feasible and practical.
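
As an illustration of the CondConv-style dynamic convolution named in the abstract, the following is a minimal pure-Python sketch and not the paper's implementation. It assumes a 1x1 convolution, sigmoid routing over globally average-pooled features, and hypothetical names (`routing_weights`, `mix_kernels`, `conv1x1`, `condconv_1x1`); the abstract does not state the number of experts, the kernel sizes, or how the layer is wired into Transformer-Village.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def routing_weights(x, route_w, route_b):
    """Per-input mixing weights, one per expert kernel.

    x: list of C_in channels, each an H x W list of lists.
    route_w: E x C_in routing matrix (assumed), route_b: E biases (assumed).
    """
    # Global average pooling: one scalar per input channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    return [
        sigmoid(sum(w * p for w, p in zip(w_row, pooled)) + b)
        for w_row, b in zip(route_w, route_b)
    ]


def mix_kernels(alphas, experts):
    """Weighted sum of E expert kernels, each of shape C_out x C_in."""
    c_out, c_in = len(experts[0]), len(experts[0][0])
    return [
        [sum(a * k[o][c] for a, k in zip(alphas, experts)) for c in range(c_in)]
        for o in range(c_out)
    ]


def conv1x1(x, kernel):
    """Apply a C_out x C_in kernel independently at every pixel."""
    h, w = len(x[0]), len(x[0][0])
    return [
        [
            [sum(kernel[o][c] * x[c][i][j] for c in range(len(x))) for j in range(w)]
            for i in range(h)
        ]
        for o in range(len(kernel))
    ]


def condconv_1x1(x, experts, route_w, route_b):
    """Mix the expert kernels for this input, then run a single convolution."""
    alphas = routing_weights(x, route_w, route_b)
    return conv1x1(x, mix_kernels(alphas, experts)), alphas
```

Because the expert kernels are combined before the convolution is applied, each input costs one convolution regardless of the number of experts, which is consistent with the lightweight design the abstract describes.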



DOI: https://doi.org/10.22158/asir.v7n4p126


Copyright (c) 2023 Qi Zhong

This work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright © SCHOLINK INC.   ISSN 2474-4972 (Print)    ISSN 2474-4980 (Online)