In this paper, we propose a method for facial expression recognition in in-the-wild videos. Our method combines a Deep Residual Network (ResNet) with a Bidirectional Recurrent Neural Network with Long Short-Term Memory units (BLSTM). This method won second place in the seven basic expression classification track of the Affective Behavior Analysis in-the-wild Competition, held in conjunction with the IEEE International Conference on Automatic Face and Gesture Recognition (FG) 2020, achieving 66.9% accuracy and a final metric score of 40.8% on the test set. We also visualize the learned attention maps and analyze the importance of different facial regions for expression recognition.