A deep neural network based on a convolutional encoder–decoder (CED) architecture is proposed for the electromagnetic-thermal co-simulation of microstrip patch antenna arrays. The proposed network comprises two sub-networks, a gain pattern network and a temperature network, which take the antenna array structure and frequency as input and output the gain pattern and the temperature distribution, respectively. To improve the performance of the temperature network, a novel methodology named multiphysics chain-of-thought (MPCoT) is introduced, which adds intermediate reasoning steps that mirror the real physical processes. The network is trained on datasets generated via the finite element method (FEM) from various array configurations (2 × 2, 3 × 3, 4 × 4, and 5 × 5), and an off-line training pipeline is designed for it. Owing to its strong approximation capability, the proposed network maintains accuracy while significantly reducing CPU time and memory consumption compared with conventional numerical methods for electromagnetic-thermal problems.
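To make the data flow concrete, the following is a minimal sketch of how a staged temperature network in the spirit of MPCoT might be composed. It is not the authors' implementation. The input encoding (a patch-occupancy mask padded to 5 × 5 plus a constant frequency channel), the choice of a dissipated-power field as the intermediate quantity, and the names `encode_input`, `toy_loss_stage`, `toy_thermal_stage`, and `chain` are all assumptions made for illustration. The two stage functions are toy arithmetic stand-ins for trained CED sub-networks, not physical models.

```python
from typing import Callable, List, Sequence, Tuple

Grid = List[List[float]]


def encode_input(n: int, freq_ghz: float, size: int = 5) -> List[Grid]:
    """Encode an n x n array and a frequency as two size x size channels.

    Assumption: smaller arrays are zero-padded to the largest configuration.
    """
    if not 1 <= n <= size:
        raise ValueError("n must lie between 1 and size")
    mask = [[1.0 if r < n and c < n else 0.0 for c in range(size)]
            for r in range(size)]
    freq = [[freq_ghz for _ in range(size)] for _ in range(size)]
    return [mask, freq]


def toy_loss_stage(channels: List[Grid]) -> Grid:
    """Stand-in for a sub-network predicting an intermediate loss field."""
    mask, freq = channels
    return [[0.1 * m * f for m, f in zip(mask_row, freq_row)]
            for mask_row, freq_row in zip(mask, freq)]


def toy_thermal_stage(loss: Grid, ambient: float = 25.0) -> Grid:
    """Stand-in for a sub-network mapping the loss field to temperature."""
    return [[ambient + q for q in row] for row in loss]


def chain(stages: Sequence[Callable]) -> Callable:
    """Compose stages in order and keep every intermediate output.

    Keeping the trace would allow each intermediate field to be supervised
    against reference data during training, if such data is available.
    """
    def run(x) -> Tuple[object, list]:
        trace = []
        for stage in stages:
            x = stage(x)
            trace.append(x)
        return x, trace
    return run


# Hypothetical usage: a 3 x 3 array at 2 GHz.
temperature_net = chain([toy_loss_stage, toy_thermal_stage])
temperature, trace = temperature_net(encode_input(n=3, freq_ghz=2.0))
```

In this sketch the final temperature map is produced only after an explicit intermediate field has been computed, so each link in the chain can be inspected separately. The gain pattern network would consume the same encoded input through its own encoder–decoder.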