mmsa: a text-image bimodal sentiment analysis experiment on the Yelp restaurant review dataset. My model architecture is as follows: