wprazuch / e-resume

My professional experience and history in one place

Introduction

Machine Learning enthusiast, currently working as a Senior Machine Learning Engineer, with both academic and professional experience in implementing intelligent systems. Always striving to develop, dedicated to my work, focused on results, and constantly looking for opportunities to prove myself useful. My current areas of expertise are two-fold:

  • business oriented, where I focus on retail, large-scale Machine Learning systems, and intelligent medical applications;
  • domain oriented, where I work best with MLOps, tabular data, and Computer Vision systems.

My professional career started with a part-time job as a Software Developer for the .NET framework, which provided additional income during my Bachelor's and MSc studies. I worked on Point of Sale (PoS) applications for large retail fashion chains, where a customer service application was built to handle loyalty systems, discounts, and transactions. During this time, I had the occasion to work with great Software Architects and Software Engineers, and I gained good knowledge of rituals in IT as well as design patterns and large IT systems.

My adventure with Machine Learning started during my MSc studies, when I began working with mass spectrometry data to create an automated heterogeneity assessment system for Head & Neck cancer tumours. Next, I joined the PhD program to develop a system for early lung cancer detection from low-dose CT images.

Social media

From time to time, I publish news about my area of expertise. I have also written some short articles on Data Science and Mathematics.

https://medium.com/@wprazuch

https://twitter.com/PrazuchWojciech

Work experience

Oct 2019 - May 2023 - PhD Candidate at Silesian University of Technology. My PhD thesis focused on implementing an automated system for detecting malignant lesions in low-dose Computed Tomography. Unfortunately, my studies were interrupted due to personal matters.

Delivered Projects

Real Estate Agent Segmentation

The project was conducted for one of the biggest real estate companies in the United States, which also operates across the globe. From long experience, the client had recognized different agent sales patterns. During the project, we built an agent segmentation system that not only recognized different agent groups in an unsupervised manner, but also provided the client with automatic explanations of the clusters. The client could then analyze the most important characteristics of each cluster and name the agent group in their own jargon. This system provided a baseline for recommendations tailored to each agent's professional strategy, such as additional training programs or networking possibilities.

Marketing Campaigns Optimization

For the client's company, the goal was to increase the activation rate of churning customers on the platform. Specific rules defined different customer churn segments. For each customer in those segments, the goal was to incentivize the customer by sending a voucher. Voucher values varied, and the aim was to send each customer, depending on their profile, a voucher attractive enough to prompt an order. The system aimed to balance cost reduction for the company against activation maximization under a specific budget.
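As a rough illustration of the budget-constrained selection described above, the sketch below picks at most one voucher per customer so that the expected number of activations is maximized without exceeding a total budget. It is not the production system: the function name `allocate_vouchers`, the customer ids, the voucher values, and the activation probabilities are all hypothetical, and the sketch assumes that the budget is counted as the face value of the vouchers sent and that some model has already estimated an activation probability for each voucher value.

```python
def allocate_vouchers(options, budget):
    """Pick at most one voucher per customer to maximize expected activations.

    options: {customer_id: {voucher_value: estimated_activation_probability}}
             (hypothetical inputs; in practice they would come from a model)
    budget:  total face value of vouchers that may be sent (integer units)
    Returns (expected_activations, {customer_id: voucher_value}).
    """
    # best[b] = best (score, plan) found so far with total voucher value <= b
    best = [(0.0, {})] * (budget + 1)
    for customer, vouchers in options.items():
        updated = list(best)  # default: send this customer nothing
        for b in range(budget + 1):
            for value, prob in vouchers.items():
                if value <= b:
                    prev_score, prev_plan = best[b - value]
                    if prev_score + prob > updated[b][0]:
                        updated[b] = (prev_score + prob, {**prev_plan, customer: value})
        best = updated
    return best[budget]


# Made-up example: three customers, two voucher values each, budget of 15.
example = {
    "a": {5: 0.2, 10: 0.5},
    "b": {5: 0.4, 10: 0.45},
    "c": {5: 0.1, 10: 0.15},
}
score, plan = allocate_vouchers(example, budget=15)
print(plan, round(score, 2))
```

This is a multiple-choice knapsack solved by dynamic programming over the budget. It is exact for small integer voucher values, and a greedy or solver-based approach would be a natural substitute at larger scale.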

Customer Lifecycle Segmentation

For one of the biggest delivery platforms in the world, I developed an intelligent customer incentivization system based on sales patterns in different countries. The functionality generates a net profit of 14 million euros each year for the company.

Next Best Action

The holy grail of retail companies, where the system itself decides on the best strategy to target customers. One of the metrics we used was the net increase. I laid the foundation for the system, and the project is still being developed.

Circa

The system was developed thanks to a rich, diverse dataset of X-ray images of patients from hospitals across Poland. The goal was to identify pneumonia in the images caused specifically by COVID-19. We applied various techniques for bias reduction and model explainability.

Astral

This project was performed in collaboration with Universitätsklinikum Essen, Germany, where I developed an application in Airflow for the analysis and reporting of microscopic image sequences of astrocyte communication in mouse brains.

Covrad

The goal of the project was to assess the severity of COVID-19 pneumonia in patients' CT scans, in terms of different pneumonia patterns and side effects of the disease inside the lung parenchyma.

Brain Motion Correction

This project was done in collaboration with Henry Ford Hospital in Detroit, where motion correction algorithms were applied to MRI scans of mouse brains undergoing various hemorrhage treatments.

Mass Spectrometry Imaging

The goal of the project was to assess the heterogeneity of cancer tissue dissected from Head and Neck cancer patients. An unsupervised, hierarchical clustering algorithm, built on K-Means with automatic tuning of k, was created. The system produced analytics about the heterogeneity of the tissue, thus giving information about the possible difficulty of cancer treatment.
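To make "K-Means with automatic k tuning" more concrete, here is a minimal, single-level sketch in plain Python. It is not the project's algorithm: the text does not say which criterion was used to tune k, so the mean silhouette score is used here as an assumption, the farthest-first initialization is chosen only to keep the example deterministic, and the names `kmeans`, `silhouette`, and `choose_k`, as well as the toy 2D points, are hypothetical. A hierarchical variant could apply the same step recursively inside each resulting cluster.

```python
import math
import random


def farthest_first_centers(points, k):
    """Deterministic init: start at points[0], then repeatedly add the point
    farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=100):
    """Plain Lloyd's K-Means; returns one cluster label per point."""
    centers = farthest_first_centers(points, k)
    labels = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centers[c])  # keep the center of an empty cluster
        if updated == centers:
            break
        centers = updated
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; higher means better-separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


def choose_k(points, k_max=6):
    """Try k = 2..k_max and keep the labeling with the best silhouette."""
    best = None
    for k in range(2, min(k_max, len(points) - 1) + 1):
        labels = kmeans(points, k)
        score = silhouette(points, labels)
        if best is None or score > best[2]:
            best = (k, labels, score)
    return best


# Toy data: three made-up, well-separated 2D blobs of 20 points each.
rng = random.Random(0)
blobs = [(0, 0), (10, 0), (5, 10)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)) for cx, cy in blobs for _ in range(20)]
k, labels, score = choose_k(points)
print(k, round(score, 2))
```

For real mass spectrometry data, a library implementation such as scikit-learn's `KMeans` and `silhouette_score` would be the more practical choice. The pure-Python version is used here only to keep the sketch self-contained.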

Main Skills

  • Ability to conduct research, read research papers
  • Python
  • Machine Learning
  • Software Engineering

Skills

Programming Languages

  • Python - long-term experience; I have been developing in Python since 2017. The applications I developed range from complex Machine Learning and Data Science pipelines, through scripting and automation, to backend systems in FastAPI, Flask, and Django. I also have experience with multiprocessing and joblib.
  • SQL - long-term experience, since 2016. I have used SQL in OLTP systems such as SQL Server, MySQL, and PostgreSQL, as well as in OLAP systems such as BigQuery.
  • Java - I developed backend solutions for Java EE and some demo applications in Spring and Spring Boot.
  • C++ - I implemented many common algorithms and data structures from scratch, as well as Tetris and Chess games and some computer graphics transformations and methods.
  • C# - I worked with it professionally for 3 years and implemented many plug-in applications for Point-of-Sale systems, such as loyalty cards, payment methods, and inventory management views.
  • Kotlin - I had the occasion to translate some common Machine Learning operations and models for mobile solutions. For one of the projects, I translated and converted a SOTA OCR system for a mobile application, where both performance and low computational complexity matter most.

Python-oriented Stack

Python Data Science Stack - NumPy, Pandas, SciPy, scikit-learn, XGBoost, statsmodels, matplotlib, seaborn, plotly, pandas-gbq, Dask, venv, Jupyter Notebook, Numba

Computer Vision - scikit-image, OpenCV, Pillow

Machine Learning Stack - scikit-learn, TensorFlow, Keras, PyTorch, Hugging Face, ONNX

MLOps tools - Polyaxon, MLflow, Streamlit, Dash, Kubeflow, DVC, Airflow, Luigi, Poetry, Atlantis, GitHub Actions, CI/CD pipelines, Terraform, Docker, Kubernetes, TensorBoard

Medical Toolkit - PyNrrd, SimpleITK, PyDicom, Nibabel

ML-focused Design Patterns -

Backend stack - FastAPI, Flask, Django

Standard Development Toolkit - Git, Zsh, JIRA, Confluence, Slack

Google Cloud Platform (GCP) - BigQuery, Vertex AI, Google Cloud Storage (GCS), Google Container Registry (GCR)

Amazon Web Services (AWS) - AWS Lambda, EC2, Elastic Container Registry (ECR), S3, Elastic Container Service (ECS), AWS SageMaker

Soft skills

At university, I conducted many seminars describing the scientific work I was doing. Some of my seminar topics included: clustering algorithms for mass spectrometry imaging, visualization techniques for high-dimensional data, image processing pipelines for microscope array data, lung cancer domain problems, and Machine Learning systems and challenges for lung cancer screening and identification.

At work, I conducted many seminars covering both practical and theoretical subjects from the Machine Learning domain. Some of them included scientific paper reviews and live demos of MLOps tools.

Moreover, I had the occasion to give live presentations in which I described well-known breakthrough achievements of AI, such as DeepFakes, in layman's terms.

Work experience

Sep 2016 - Oct 2019 - Software Developer at Diebold-Nixdorf, implementing solutions for Point of Sale (PoS) systems

Oct 2019 - May 2023 - PhD Candidate at Silesian University of Technology (SUT)

Dec 2020 - June 2022 - Machine Learning Engineer at Netguru

July 2022 - present - Senior Machine Learning Engineer at Netguru

Professional Activities

My teaching experience covers many courses at Silesian University of Technology:

  • Fundamentals of Computer Programming - where I taught freshmen the basics of the C language, such as conditional statements, for loops, functions, structures, algorithms, and data structures.
  • Optimization and Decision Making
  • Probability and Statistics
  • Bioinformatics and Biostatistics
  • Optimization Methods
  • Statistical Learning
  • Deep Learning in Data Science - I created the laboratory syllabus and contents together with my colleagues. Some of the lab topics are: Introduction to Jupyter Notebooks and TensorFlow, Feed-forward Networks, Convolutional Neural Networks, Convolutional Neural Networks 2, Generative Adversarial Networks, Recurrent Neural Networks, Style Transfer, Hyperparameter Tuning, Transfer Learning, and Multi-task Learning.

Certifications

AWS Certified Solutions Architect - Associate, Amazon Web Services (AWS). Issued May 2022 · Expires May 2025. Credential ID: LG7NC4YBNFREQ85S

Certificate in Advanced English (CAE) - Cambridge English Level 3 Certificate in ESOL International

Research

Prazuch, W. et al. (2022). Radiomic-Based Lung Nodule Classification in Low-Dose Computed Tomography. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2022. Lecture Notes in Computer Science, vol 13346. Springer, Cham. https://doi.org/10.1007/978-3-031-07704-3_29

Binczyk, F., Prazuch, W. et al. Radiomics and artificial intelligence in lung cancer screening - DOI: 10.21037/tlcr-20-708, Translational Lung Cancer Research, IF 6.498

Dzyubenko, E., Prazuch, W. et al. Analysing intercellular communication in astrocytic networks using "Astral" - DOI: 10.3389/fncel.2021.689268, Frontiers in Cellular Neuroscience, IF 5.505

RACOPM 2021: Marek Socha, Aleksandra Suwalska, Wojciech Prazuch, Michał Marczyk, Joanna Polanska, and POLCOVID Study Group. "UMAP-based graphic representation of POLCOVID chest X-Ray data set heterogeneity" in Recent Advances in Computational Oncology and Personalised Medicine (Eds. Ziemowit Ostrowski and Sylwia Bajkacz), 2021, vol.1, pp.100-114, ISBN 978-83-7880-800-8. http://delibra.bg.polsl.pl/Content/74386/BCPS-83568_2021_UMAP-based-graphic-r_0000.pdf

CIRCA: comprehensible online system in support of chest X-rays-based COVID-19 diagnosis. Wojciech Prazuch, Aleksandra Suwalska, Marek Socha, Joanna Tobiasz et al. https://arxiv.org/abs/2210.05440

POLCOVID: a multicenter multiclass chest X-ray database (Poland, 2020-2021). Aleksandra Suwalska, Joanna Tobiasz, Wojciech Prazuch, Marek Socha et al. https://arxiv.org/abs/2211.16359

nUMAP: Neural Network Based UMAP Solution for the multi dataset visualisation. Aleksandra Suwalska, Marek Socha, Wojciech Prazuch et al. https://delibra.bg.polsl.pl/Content/75754/BCPS-85106_2022_nUMAP-neural-networ_0000.pdf

Blog Posts

Machine Learning Tools Comparison

Data Science vs. Business Intelligence: What’s the Difference?

10 Examples of Data Science in Marketing

11 Examples of Data Science in Finance

14 Key Trends in Data Science for 2021

How the Internet of Things Is Changing Retail

6 Phases of the Data Science Project Life Cycle

5 Examples of How Machine Learning Supports Business Growth

The 16 Top Cross-Industry IoT Analytics Applications

Examples of How Big Brands Are Using Advanced Analytics

Solving static optimization problem using modified Kuhn-Tucker conditions

How to easily improve model’s performance with standardization

Open-source Contributions

google/gin-config - Where we introduced pickled configs

Personal Projects

Mail & SMS Spam Detection - In this project, I built a deep learning model to detect spam in SMS and e-mail messages. Additionally, I provided some EDA for the data. I also created a simple endpoint in Flask to communicate with the model.

Face Mask Detection - In this project, I created a face mask detection system that checks whether a person is wearing a face mask. Again, you can communicate with the model using a REST API written in Flask; an example client is shown in the repo.

Automatic Stack Exchange Tagging - The intention of the project was to create an automated way of tagging questions raised by users on Stack Exchange.

Face recognition - The goal of the project was to recognize the faces in a given photo. A pretrained face recognition model was used to generate embeddings for each face provided.

IMDb Movie Reviews - This project showed how to build a model for a simple Sentiment Analysis task on IMDb Movie Reviews data.

Deep Learning from Scratch - In this project, I coded some simple neural networks from scratch and applied backpropagation to train them.

Stock Price Prediction - Common deep learning architectures used to predict stock prices from real-world data.

Music Genre Classification - Predicting music genre using ML algorithms. Some EDA was provided as well.

Twitter Sentiment Analysis - Classification of Tweet sentiments together with EDA.
