Michał Bieroński

Verified Expert in Engineering


Kraków, Poland
Toptal Member Since
January 21, 2022

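Michał has nearly nine years of professional experience in data science, machine learning, and software development. He has a computer science background and can fit in data scientist and machine learning engineer roles. Michał has solved multiple problems in conversational AI, NLP, computer vision, time series forecasting, social media analysis, learning from graph-structured data, supply chain analysis, data visualization, and production deployment.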


Snowflake, MLflow, Apache Airflow, Apache Spark, Amazon Web Services (AWS)...
Python, Scikit-learn, Azure, Azure Machine Learning, Azure Data Factory...
BERT, Streamlit, Dash, Azure, Scikit-learn, GitLab, GitLab CI/CD...




Preferred Environment

Linux, PyCharm, Git, Zsh

The most amazing...

...thing I've developed is Matchmaking Simulator, an AI-powered engine that led to a new player D30 retention lift of 33% and a D30 revenue lift of 11%.

Work Experience

Senior Data Scientist

2021 - PRESENT
  • Built the Matchmaking Simulator engine by creating an AI player model (combining multiple behavioral ML models) and replicating the production matchmaking engine in Python to allow for fast experimentation. The engine is still in use.
  • Improved the matchmaking engine and developed new algorithms using the simulator, resulting in a D30 retention lift of 33% and a D30 revenue lift of 11%. Led inter-team efforts to deploy, monitor impact, and scale developed features.
  • Built a conversion ML model and integrated it as a component of the Matchmaking Simulator, allowing for experimentation on how the company can drive the conversion rate.
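  • Operationalized the churn prediction model using AWS, Docker, Kubernetes, GitHub Actions, MLflow, Airflow, and Snowflake.
  • Researched industry-standard solutions, built a simulation framework, chose the most viable solution, and implemented production-ready code allowing for efficient rating updates in multiplayer games.
  • Owned matchmaking, offering advice and answering questions from leadership and business stakeholders. Led matchmaking experimentation and analysis. Collaborated with the engineering, analytics, and product departments.
  • Planned and supervised the work of a junior team member.
Technologies: Snowflake, MLflow, Apache Airflow, Apache Spark, Amazon Web Services (AWS), Java, Groovy, Python, Scikit-learn, XGBoost, Streamlit, Amazon S3 (AWS S3), Amazon EC2, Amazon Elastic Container Service (Amazon ECS), Amazon SageMaker, Algorithms, Machine Learning, Data Science, Artificial Intelligence (AI), SQL, NumPy, Pandas, DVC, Pytest, Continuous Integration (CI)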


2020 - 2021
  • Designed and built the cloud architecture for the data analysis platform for development and production in the Azure cloud.
  • Used platforms and tools such as ADF, Azure Data Lake, Azure Databricks, Azure DevOps, Azure Key Vault, Delta Lake, MLflow, Azure Machine Learning, Azure Kubernetes Service, Azure Synapse Analytics, and Power BI.
  • Preprocessed, cleaned, and identified outliers in the training data for further examination with the business. Example techniques used include DBSCAN and Isolation Forest.
  • Built the DNN regression model for forecasting the cost of oil drilling-related activities based on historical data. Used PyTorch and PyTorch Lightning.
  • Conducted experiments and statistical tests like analysis of variance (ANOVA) and applied AI solutions.
  • Managed cloud automation of the labor and overhead of the cost forecasting process, previously done manually by analysts in Excel spreadsheets.
  • Managed the Databricks cloud data analysis platform. Built and optimized existing ETL pipelines using PySpark. Introduced production monitoring with Azure Application Insights into the project.
  • Maintained and developed the time series forecasting library. Managed the production submission of quarterly cost forecasts, cutting the running time of the quarterly forecast 14-fold by using multiprocessing.
  • Built and deployed a web API using FastAPI, Docker, and Azure Web App to make time series models accessible to non-data scientists within the company. Built a Power BI report showing the API's usage.
  • Created the auto-deployment pipeline for the newest version of the internal time series forecasting library into the Spark cluster. Built a CI pipeline on Azure DevOps that runs code quality checks and unit tests, and trained the team on it.
Technologies: Python, Scikit-learn, Azure, Azure Machine Learning, Azure Data Factory, Azure Data Lake, Databricks, Azure Key Vault, Delta Lake, MLflow, Azure Kubernetes Service (AKS), Azure Container Instances, Azure Functions, Azure Synapse, Microsoft Power BI, PySpark, Spark, Machine Learning Operations (MLOps), Azure DevOps, ETL, Web Applications, FastAPI, Time Series Analysis, Azure Application Insights, Docker, Azure Container Registry, PyTorch, DNN, Pytest, Plotly, Statistics, SQL Server 2017, SQL, Apache Spark, Containerization, DevOps, NumPy, Pandas, Neural Networks, Time Series, Algorithms, Machine Learning, Deep Learning

Senior Data Scientist

2020 - 2021
  • Developed and deployed a Dash-based business intelligence dashboard for an FMCG industry recommendation system to help marketing teams better understand and target their customers.
  • Managed supply chain analysis for a bank. Given internal bank transaction data, I developed a solution for analyzing the impact of the default of some business entities on other businesses.
  • Led a PoC project for an academic institution that turned into a long-term engagement, aiming to answer natural language questions using a knowledge graph. The solution relied on intent classification and named entity recognition (NER).
  • Led an unstructured data insights project. Its goal was to get insights about unstructured raw text data using techniques like topic modeling (LDA) and sentiment analysis (BERT).
  • Conducted interviews for data science industry positions, such as data analysts, data scientists, and data engineers.
Technologies: BERT, Streamlit, Dash, Azure, Scikit-learn, GitLab, GitLab CI/CD, Microsoft PowerPoint, SpaCy, NetworkX, Docker, Docker Compose, Code Review, Plotly, Data Science, Artificial Intelligence (AI), DevOps, Containerization, Machine Learning, Deep Learning, PyTorch, SQL, NoSQL


2019 - 2020
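  • Researched and developed state-of-the-art natural language processing systems, conversational engines, and image representation for product recommendation solutions for chatbots.
  • Trained deep learning models on large volumes of data using distributed GPUs.
  • Handled full-stack development and maintenance of the core product. The back end used Scala with Play Framework and Java with Spring, the front end used React with TypeScript, and machine learning models were deployed as gRPC microservices written in Python and Scala.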
  • Led a subproject of an external startup named "密码狗". Built core application features like encryption, private key backup, and chat with Java and gRPC.
Technologies: Python, gRPC, Docker, Kubernetes, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), GPT, Chatbots, Scala, Java, Spring, Play Framework, React, TypeScript, NumPy, Pandas, Machine Learning, Deep Learning, Amazon Web Services (AWS), NoSQL, PyTorch

Software Developer

2016 - 2018
  • Maintained and developed various internal projects with project-dependent tech stacks like Python, Django, Angular, Vue.js, and Backbone.js, as well as comprehensive server-side, client-side, and end-to-end testing.
  • Introduced Docker into the team and developed a method for e2e testing without mocking the server-side with the dockerized environment.
  • Handled server-side administration, such as Nginx, repository hooks, automating builds and CI, Docker Registry, Sentry, and Celery jobs.
Technologies: Python, Django, Django REST Framework, Angular, Vue, Backbone.js, E2E Testing, Docker, Jenkins, CI/CD Pipelines, Sentry, PostgreSQL, Celery, Bash, Linux, JavaScript, TypeScript, Test-driven Development (TDD), Scrum, DevOps, Containerization

Java Summer Trainee

2016 - 2016
  • Developed a CI plugin automating the connection of C++ compilation errors with the person responsible for breaking the code via the version control system.
  • Deployed the plugin in Jenkins CI/CD and handled product maintenance.
  • Developed a custom IDE based on the IntelliJ platform for the TTCN-3 programming language.
Technologies: Java, Jenkins, Bash, IntelliJ IDEA

Meme Learning

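A meme recommendation system based on the Meme2vec method, with meme vectorization based on social, visual, and textual embeddings, as well as a text-based meme classification model.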

The model was further used to create a Slack and Discord bot that chooses the best matching meme template for a given text, then creates a meme and sends it to the user. The project won an internal university poster session for data science projects and a poster session at the Polish Alliance for the Development of Artificial Intelligence (PP-RAI) conference.


The project combined classic image processing methods for license plate and character segmentation and a custom deep learning module for character classification trained on the self-built dataset.

Its goal was to detect and recognize license plate characters in different lighting conditions with comprehensive evaluation and comparison to other available solutions.
2018 - 2019



2014 - 2018













Pandas, PySpark, PyTorch, NumPy, Beautiful Soup, React, Vue, Backbone.js, Scikit-learn, SpaCy, NetworkX, Keras, TensorFlow, OpenCV, Matplotlib, XGBoost


PyCharm, Git, Zsh, Docker Compose, Azure Machine Learning, Jenkins, IntelliJ IDEA, Sentry, Celery, GitLab, GitLab CI/CD, Microsoft PowerPoint, Azure Key Vault, Azure Kubernetes Service (AKS), Microsoft Power BI, Azure Application Insights, Pytest, Seaborn, Plotly, Apache Airflow, Amazon Elastic Container Service (Amazon ECS), Amazon SageMaker


Python, Scala, Java, TypeScript, Bash, TTCN, JavaScript, SQL, Snowflake, Groovy


Apache Spark, Spark, gRPC, Spring, Play Framework, Django, Django REST Framework, Angular, Streamlit, Scrapy


Data Science, DevOps, Azure DevOps, E2E Testing, ETL, Test-driven Development (TDD), Scrum, Continuous Integration (CI)


PostgreSQL, SQL Server 2017, NoSQL, Amazon S3 (AWS S3)


Linux, Docker, Databricks, Azure, Kubernetes, Azure Functions, Azure Synapse, DNN, Amazon Web Services (AWS), Amazon EC2


Machine Learning, Artificial Intelligence (AI), Machine Learning Operations (MLOps), Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Chatbots, MLflow, CI/CD Pipelines, BERT, Dash, Code Review, Azure Data Factory, Azure Data Lake, Delta Lake, Azure Container Instances, FastAPI, Time Series Analysis, Azure Container Registry, Statistics, Computer Networking, Data Scraping, Scraping, Web Scraping, Containerization, Neural Networks, Time Series, Algorithms, Deep Learning, Web Applications, DVC


