tlikhomanenko/tlikhomanenko

Dr. Tatiana Likhomanenko

Research scientist and software developer.
Semi-supervised and unsupervised learning, speech recognition.
Gravitating to core ML, video processing, and private federated learning.

Industry and Research Experience

Apple, Staff Research Scientist (Oct 2023 - present)
Apple, Senior Research Scientist (Sep 2021 - Oct 2023)
Fundamental AI Research, Postdoctoral Researcher (Aug 2019 - Aug 2021)
Speech recognition and natural language processing for speech
Advisors: Ronan Collobert, Gabriel Synnaeve
Fundamental AI Research, AI Resident (Sep 2018 - Aug 2019)
Speech recognition and natural language processing for speech
Advisors: Ronan Collobert, Gabriel Synnaeve
NTechLab, Machine Learning Expert (Aug 2017 - Sep 2018)
Face recognition and facial attribute prediction with deep learning at the top-1 face recognition team
Yandex & CERN, Researcher (Apr 2013 - May 2017)
Machine learning for High Energy Physics studies at the Large Hadron Collider: particle identification system, trigger system (online identification of which collisions are worth storing), searches for specific rare decays (high-level data analysis), and B meson oscillations (main subject of the LHCb studies)
Member of the Large Hadron Collider beauty (LHCb) collaboration, CERN (2013 - 2018)

Education

Ph.D. in Computer Science, Lomonosov Moscow State University (2017)
Faculty of Computational Mathematics and Cybernetics
Advisor: Eugene Moiseev
Thesis: Research on solutions of non-classical boundary-value problems for mixed type equations
M.S. in Computer Science, Yandex School of Data Analysis, 5.0/5.0 (2014)
M.S. in Computer Science, Lomonosov Moscow State University, 5.0/5.0 (2013)
Faculty of Computational Mathematics and Cybernetics
Summer School on Bayesian Methods in Deep Learning (2017)
Rome-Moscow School of Matrix Methods and Applied Linear Algebra (2012, 2013)

Software

mlx-data: framework-agnostic data loading library brought to you by Apple machine learning research; it works with PyTorch, JAX, or MLX
Flashlight: a fast, flexible machine learning library written entirely in C++
blog post
Wav2letter++: speech recognition toolkit and recipes for papers
BDT reweighter tutorial
HepML: specific machine learning tools for purposes of high energy physics
REP: IPython-based environment for conducting data-driven research in a consistent and reproducible way

Public Talks

Simple and Efficient Self-Training Approaches for Speech Recognition, Third Workshop on Efficient Natural Language and Speech Processing (ENLSP-III), NeurIPS, New Orleans (2023)
Simple and Efficient Pseudo-Labeling for Speech Recognition, On-Device Workshop MLSys, Miami (2023)
Machine Learning at Apple, WiML@ICML, Baltimore (2022)
CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings, ReWork Deep Learning Summit, San Francisco (2022)
Positional Embedding in Transformer-based Models, Higher School of Economics (2021)
slimIPL: Language-Model-Free Iterative Pseudo-Labeling, NTR Lab and Tomsk University (2021, in Russian)
Pseudo-labeling for speech recognition, NTR Lab and Tomsk University (2021, in Russian)
Machine learning in Science and Industry, Heidelberg University (2017)
LHCb topological trigger optimization, Data&Science: Large Hadron Collider, public series, Yandex, Moscow (2016)
Classifier output calibration to probability, Heavy Flavour Data Mining workshop, Zurich University (2016)
Machine Learning and Optimization of LHC Real-Time Event Stream Filter for New Physics Discoveries, Machine Learning: Prospects and Applications Conference, Berlin (2015)

Selected Publications

Private Federated Learning

Pelikan*, M., Azam, S.S., Feldman, V., Silovsky, J., Talwar, K. and Likhomanenko*, T. Federated Learning with Differential Privacy for End-to-End Speech Recognition, 2023. arXiv preprint arXiv:2310.00098. Under review.
Azam*, S.S., Pelikan*, M., Feldman, V., Talwar, K., Silovsky, J. and Likhomanenko*, T. Federated Learning for Speech Recognition: Revisiting Current Trends Towards Large-Scale ASR. In International Workshop on Federated Learning in the Age of Foundation Models in Conjunction with NeurIPS 2023. Oral
overview, video, slides, poster
Azam, S.S., Likhomanenko, T., Pelikan, M. and Silovsky, J. Importance of Smoothness Induced by Optimizers in FL4ASR: Towards Understanding Federated Learning for End-to-End ASR, ASRU 2023.

Machine Learning

Busbridge*, D., Ramapuram*, J., Ablin*, P., Likhomanenko*, T., Dhekane, E.G., Suau, X. and Webb, R. How to Scale Your EMA. Thirty-Seventh Conference on Neural Information Processing Systems (NeurIPS), 2023. Spotlight.
overview, video, slides, poster
Zhai*, S., Likhomanenko*, T., Littwin*, E., Busbridge*, D., Ramapuram*, J., Zhang, Y., Gu, J. and Susskind, J. Stabilizing Transformer Training by Preventing Attention Entropy Collapse. In International Conference on Machine Learning (ICML), 2023.
overview, video, poster, code
Gheini, M., Likhomanenko, T., Sperber, M. and Setiawan, H. Joint Speech Transcription and Translation: Pseudo-Labeling with Out-of-Distribution Data. ACL Findings, 2023.
overview
Zhai, S., Jaitly, N., Ramapuram, J., Busbridge, D., Likhomanenko, T., Cheng, J.Y., Talbott, W., Huang, C., Goh, H. and Susskind, J.M. Position Prediction as an Effective Pretraining Strategy. In International Conference on Machine Learning (ICML), 2022, pp. 26010-26027. PMLR. (Spotlight)
overview, video, poster
Kahn, J.D., Pratap, V., Likhomanenko, T., Xu, Q., Hannun, A., Cai, J., Tomasello, P., Lee, A., Grave, E., Avidov, G., Steiner, B., Liptchinsky, V., Synnaeve, G., Collobert, R. Flashlight: Enabling Innovation in Tools for Machine Learning. In International Conference on Machine Learning (ICML), 2022, pp. 10557-10574. PMLR. (Spotlight)
video, presentation, poster, code
Likhomanenko, T., Xu, Q., Synnaeve, G., Collobert, R. and Rogozhnikov, A. CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings. Thirty-Fifth Conference on Neural Information Processing Systems (NeurIPS), 2021.
openreview, video, presentation, code
Rogozhnikov, A., Likhomanenko, T. InfiniteBoost: building infinite ensembles with gradient descent. arXiv preprint arXiv:1706.01109. 2017.

Automatic Speech Recognition

2023

Rouditchenko, A., Collobert, R. and Likhomanenko, T., AV-CPL: Continuous Pseudo-Labeling for Audio-Visual Speech Recognition, 2023. arXiv preprint arXiv:2309.17395. Under review.
Likhomanenko, T., Lugosch, L. and Collobert, R. Unsupervised ASR via Cross-Lingual Pseudo-Labeling, 2023. arXiv preprint arXiv:2305.13330. Under review.
Berrebbi, D., Collobert, R., Jaitly, N., Likhomanenko, T. More Speaking or More Speakers? ICASSP 2023.
overview
Berrebbi, D., Collobert, R., Bengio, S., Jaitly, N., Likhomanenko, T. Continuous Pseudo-Labeling from the Start. ICLR 2023.
overview, video, slides, poster

2022

Likhomanenko, T., Collobert, R., Jaitly, N., Bengio, S. Continuous Soft Pseudo-Labeling in ASR. I Can’t Believe It’s Not Better Workshop at NeurIPS 2022.
video, poster
Lugosch, L., Likhomanenko, T., Synnaeve, G. and Collobert, R. Pseudo-Labeling for Massively Multilingual Speech Recognition. ICASSP 2022.
blog post, code
Pratap, V., Xu, Q., Likhomanenko, T., Synnaeve, G. and Collobert, R. Word Order Does Not Matter For Speech Recognition. ICASSP 2022.

2021

Manohar, V., Likhomanenko, T., Xu, Q., Hsu, W.N., Collobert, R., Saraf, Y., Zweig, G. and Mohamed, A., 2021. Kaizen: Continuously improving teacher using Exponential Moving Average for semi-supervised speech recognition. ASRU 2021.
Likhomanenko, T., Xu, Q., Kahn, J., Synnaeve, G. and Collobert, R. slimIPL: Language-model-free iterative pseudo-labeling. Interspeech 2021.
video, poster, code
Likhomanenko*, T., Xu*, Q., Pratap*, V., Tomasello, P., Kahn, J., Avidov, G., Collobert, R. and Synnaeve, G. Rethinking Evaluation in ASR: Are Our Models Robust Enough? Interspeech 2021.
video, poster, code
Hsu, W.N., Sriram, A., Baevski, A., Likhomanenko, T., Xu, Q., Pratap, V., Kahn, J., Lee, A., Collobert, R., Synnaeve, G. and Auli, M., 2021. Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training. Interspeech 2021.
Xu, Q., Baevski, A., Likhomanenko, T., Tomasello, P., Conneau, A., Collobert, R., Synnaeve, G. and Auli, M., 2021, June. Self-training and pre-training are complementary for speech recognition. In ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 3030-3034). IEEE.
video
Talnikar, C., Likhomanenko, T., Collobert, R. and Synnaeve, G., 2021, June. Joint masked CPC and CTC training for ASR. In ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 3045-3049). IEEE.
video, poster, presentation

2020

Xu, Q., Likhomanenko, T., Kahn, J., Hannun, A., Synnaeve, G. and Collobert, R., 2020. Iterative Pseudo-Labeling for Speech Recognition. Proc. Interspeech 2020, pp.1006-1010.
video, code
Pratap, V., Xu, Q., Kahn, J., Avidov, G., Likhomanenko, T., Hannun, A., Liptchinsky, V., Synnaeve, G., Collobert, R. (2020) Scaling Up Online Speech Recognition Using ConvNets. Proc. Interspeech 2020, 3376-3380.
video, blog post, news
Kahn, J., Rivière, M., Zheng, W., Kharitonov, E., Xu, Q., Mazaré, P.E., Karadayi, J., Liptchinsky, V., Collobert, R., Fuegen, C. and Likhomanenko, T., 2020, May. Libri-light: A benchmark for ASR with limited or no supervision. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 7669-7673). IEEE.
presentation, blog post, code
Synnaeve*, G., Xu*, Q., Kahn*, J., Likhomanenko*, T., Grave*, E., Pratap, V., Sriram, A., Liptchinsky, V. and Collobert, R. End-to-end ASR: from supervised to semi-supervised learning with modern architectures. SAS Workshop ICML 2020.
video, code

2019

Likhomanenko, T., Synnaeve, G. and Collobert, R., 2019. Who Needs Words? Lexicon-Free Speech Recognition. Proc. Interspeech 2019, pp.3915-3919.
presentation, blog post, code

Machine Learning in High Energy Physics

Derkach, D., Hushchyn, M., Likhomanenko, T., Rogozhnikov, A., Kazeev, N., Chekalina, V., Neychev, R., Kirillov, S., Ratnikov, F. and LHCb collaboration. Machine-Learning-based global particle-identification algorithms at the LHCb experiment. Journal of Physics: Conference Series. 2018. Vol. 1085. No. 4. P. 1-5.
ACAT 2017, poster
Likhomanenko, T., Derkach, D., Rogozhnikov, A. Inclusive Flavour Tagging Algorithm. Journal of Physics: Conference Series, 2016.
ACAT 2016, poster, code
LHCb collaboration (2016). Search for decays of neutral beauty mesons into four muons, JHEP 03 (2017) 001.
Likhomanenko, T., Ilten, P., Khairullin, E., Rogozhnikov, A., Ustyuzhanin, A., Williams, M. LHCb Topological Trigger Reoptimization. Journal of Physics: Conference Series, 2015.
CHEP 2015, presentation, code
CMS collaboration, LHCb collaboration. Observation of the rare Bs0 → μ+ μ− decay from the combined analysis of CMS and LHCb data. Nature, 2015.
Likhomanenko, T., Rogozhnikov, A., Baranov, A., Khairullin, E., & Ustyuzhanin, A. Reproducible Experiment Platform. Journal of Physics: Conference Series (Vol. 664, No. 5, p. 052022).
CHEP 2015, poster
LHCb collaboration. Search for the lepton flavour violating decay τ− → μ− μ+ μ−. Journal of High Energy Physics, 2015.
Likhomanenko, T., Rogozhnikov, A., Baranov, A., Khairullin, E., Ustyuzhanin, A. Improving reproducibility of data science experiments, ICML 2015 AutoML Workshop, 2015
poster spotlight

Partial Differential Equations (Ph.D.)

Moiseev, E.I., Likhomanenko, T.N. Eigenfunctions of the Gellerstedt problem with an inclined-type change line. Integral Transforms and Special Functions, 2017, pp. 1–8.
Moiseev, E.I., Likhomanenko, T.N. On the basis property of a two-part trigonometric series. Doklady Mathematics, 2016, Vol. 94, No. 1, pp. 1–4.
oral talk, International scientific conference Actual Problems in Theory of Partial Differential Equations, dedicated to the centenary of Andrey V. Bitsadze, 2016
Moiseev, E.I., Likhomanenko, T.N. Eigenfunctions of the Tricomi problem with an inclined type change line. Differential Equations, 2016, Vol. 52, No. 10, pp. 1323–1330.
oral talk, International scientific conference Actual Problems in Theory of Partial Differential Equations, dedicated to the centenary of Andrey V. Bitsadze, 2016
Moiseev, E.I., Likhomanenko, T.N. On the basis property of a trigonometric system arising in the Frankl problem. Differential Equations, 2013, Vol. 49, No. 3, pp. 325–331.
oral talk, AMEE-2013 and Lomonosov-2013
Moiseev, E.I., Likhomanenko, T.N. A nonlocal boundary value problem for the Lavrent’ev-Bitsadze equation. Doklady Mathematics, 2012, Vol. 86, No. 2, pp. 635–637.
oral talk, AMEE-2012 and Lomonosov-2012

Teaching

DeepLearn Autumn School, Self-, Weakly-, Semi-Supervised Learning in Speech Recognition (Oct 2022)
Heidelberg University, Grad Days, Machine learning in Science and Industry, invited lecturer (2017)
lectures
Imperial College London, Introduction to Machine Learning, TA (2016, 2017)
lectures/seminars 2016, lectures/seminars 2017
Yandex School of Data Analysis, Machine learning in High Energy Physics, lecturer (2016)
Lund University, Summer School on Machine Learning in High Energy Physics (MLHEP), program committee & lecturer (2016)
lectures/seminars
Saint Petersburg Academic University, Summer School on Machine Learning in High Energy Physics (MLHEP), organizing committee & lecturer (2015)
lectures/seminars

Research Activities

Serving as Reviewer

Transactions on Machine Learning Research (TMLR)
Journal of Artificial Intelligence Research
NeurIPS 2021, 2022 (top-8% reviewer), 2023 (top-8% reviewer)
ICLR 2021, 2022 (highlighted reviewer), 2023, 2024
ICLR Blogposts 2023, 2024
ICML 2022, 2023
Interspeech 2020, 2021, 2022, 2023 (top-2% reviewer), 2024
ICASSP 2021, 2022, 2023 (outstanding reviewer), 2024
Machine Learning and the Physical Sciences workshop NeurIPS 2019, 2020, 2022, 2023
SynS and ML Workshop ICML 2023
Vision-based InduStrial InspectiON (VISION) Workshop CVPR 2023
CHIME 2023
BayLearn 2022, 2023
An advisor in the LHCb statistics and machine learning working group (2016-2017)

Serving as Area Chair

ICML 2024
NeurIPS 2024
NeurIPS Datasets and Benchmarks 2023, 2024
Vision-based InduStrial InspectiON (VISION) Workshop CVPR 2023

Mentorship

WiML, Research Mentorship, NeurIPS, New Orleans (2023)
LatinX in AI, Mentorship Hour (Panel), ICML, Honolulu (2023)
LatinX in AI, CV Research workshop, CVPR, New Orleans (2022)

Panels

Failure Modes in the Age of Foundation Models, workshop "I Can’t Believe It’s Not Better (ICBINB): Failure Modes in the Age of Foundation Models", NeurIPS, New Orleans (2023)
Mentorship Hour, LatinX in AI, ICML, Honolulu (2023)
On-Device Workshop MLSys, Miami (2023)

Organizer

1st workshop and challenge on Vision-based InduStrial InspectiON, CVPR 2023

Kaggle Competition "Flavours of Physics"

research/technical support
award committee member
co-organizer of ALEPH workshop at NeurIPS 2015
starter-kit for competition

Advising

Zijin Gu, AI/ML Resident, Apple 2023-2024 (co-advising with Navdeep Jaitly)
Andrew Rouditchenko, summer internship, Apple, 2023
Lingxiao Zhao, summer internship, Apple, 2023 (co-advising)
Chun-wei Ho, summer internship, Apple, 2023 (co-advising with Navdeep Jaitly and Ronan Collobert)
Sheikh Shams Azam, AI/ML Resident, Apple 2022-2023 (co-advising with Honza Silovsky)
Dan Berrebbi, summer internship, Apple, 2022
Mozhdeh Gheini, summer internship, Apple, 2022 (co-advising with Matthias Sperber and Hendra Setiawan); Apple, 2023
Colby Bunbary, summer internship, Apple, 2022 (co-advising)
Loren Lugosch: summer internship, Facebook AI Research, 2021 (co-advising with Ronan Collobert and Gabriel Synnaeve); summer internship, Apple (co-advising with Ronan Collobert), 2022
Chaitanya Talnikar, AI Residency 2019-2020 (co-advising with Ronan Collobert and Gabriel Synnaeve)

In News

Interview with Republic (in Russian)
Q&A with AI Residents
About paper "Rethinking Evaluation in ASR: Are Our Models Robust Enough?"
About Kaggle challenge "Flavours of Physics"
About paper "LHCb Topological Trigger Reoptimization"

Honors & Awards

Winner of the Accelerate Your Code international competition, Intel (2012)
Best student of the Computer Science faculty, Lomonosov Moscow State University (2012)
Winner (regional stage) of the All-Russian Programming Contest (2007, 2008)