We contribute a multimodal dialogue dataset, the Deep Personalized Character Dataset (DPCD), collected from TV shows. It contains character-specific dialogue data in text, audio, and video, with ~10k utterances and ~6 hours of audio and video per character.