Aleksandr Shirokov

Team Lead MLOps / AI Infrastructure

WB

👋 Hi, and welcome!

My name is Aleksandr Shirokov, and I am a T3 Fullstack AI Software Engineer with 5+ years of experience, Python as my main programming language, and team-lead management skills. I specialize in Data and MLOps workflows, with hands-on experience launching AI products from scratch to production and leading small teams from MVP to scale.

I hold two Master’s degrees: one in Big Data & Machine Learning (ITMO University), and another in Data Science & Engineering (MSU AI Masters). I’m also an alumnus of the Yandex and Tinkoff Backend Academies, and an active participant in machine learning competitions.

Currently, I lead the MLOps team at the Wildberries (WB) marketplace, in the RecSys department (more than one-third of purchases on WB come from RecSys!), launching AI products and building ML infrastructure and tools for 300+ ML engineers. My team and I support the full ML lifecycle, from research to production, and work closely with real user-facing products like Search by Photo, directly impacting business metrics.

I serve as one of the public and technical faces of the MLOps team at Wildberries. Together with my manager, Vadim Kosov, I co-founded the team three years ago, when it was just the two of us. Since then, we’ve scaled it to 30+ engineers, built a strong internal reputation, and established the team as a core enabler for ML product delivery. I’ve been actively shaping the team’s engineering culture, internal visibility, and collaboration with both business and technical stakeholders.

I’m also known for combining technical excellence with clear ownership, empathy, and drive. I deeply care about people, quality, and the impact of our work — and I enjoy turning infrastructure into leverage for real products.

While my main role is in Big Tech, I’ve always been drawn to the research side of machine learning. I enjoy reading papers, exploring new methods, and turning ideas into experiments or tools. This curiosity has led to several applied research theses. It’s important for me to stay close to both the theory and practice of ML.

For more details about my experience, please see my resume.

Interests
  • ML/LLM inference optimization
  • Large-scale / multi-GPU training
  • Vector databases
  • Practical MLOps systems, including model deployment automation
  • Architecture and implementation of scalable ML pipelines
Education
  • MSc in Data Science and Data Engineering, 2021 - 2023

    Moscow State University, AIMasters (formerly OzonMasters)

  • MSc in Big Data and Machine Learning, 2021 - 2023

    ITMO University

  • BSc in Applied Math and CS in Economics, 2017 - 2021

    Saint Petersburg State University of Finance and Economics

  • High School Graduate, 2010 - 2017

    Physical and Mathematical Lyceum 239

Experience

Wildberries
Team Lead MLOps
May 2024 – Present (Remote)

Team Lead MLOps Engineer of a team of 6 developers, working across 5 MLOps streams:

  1. RecSys Products - My team and I launched, and took part in the ML system design of, many RecSys business products, which increased the RecSys team's revenue to a third of WB's total revenue:
    • Developed and launched Search by Photo V2 using the Qdrant embedding database deployed by our team, daily embedding updates via our Airflow deployment, and Triton for inference, which increased the product's revenue 4x (see the sketch after this list);
    • Developed and launched autogenerated descriptions for WB product cards using Mixtral 8x7B served with vLLM for 50,000 sellers, using SAQ; this feature will be monetized in the future;
    • Created a Triton instance with HPA and zero-downtime daily model updates for nearline calculation of user embeddings at 8,000 RPS;
    • Developed Airflow DAGs with complex business logic, a large number of steps, more than 15 integration connections, and MIG for Item2Item recommendations.
  2. Pipeline Orchestration:
    • Developed and launched, with my team, Airflow 2.x on K8s as the pipeline orchestrator for more than 200 ML developers; its killer feature is launching DAGs in different K8s and Spark/Hadoop clusters from a single (!) UI.
  3. Online Inference DL/LLM:
    • Developed and standardized the process of taking an ML/LLM model from DS experiment to production using NVIDIA Triton Inference Server and vLLM, which significantly decreased time-to-market. Launched more than 40 ML models in production with my team: modern CV/text embedders, detectors, OCR, and others;
    • Developed and applied many online-inference optimization techniques: created the Bertolt library of DL utilities, such as model weight conversion (PyTorch, ONNX, OpenVINO, TensorRT), and used Model Analyzer for performance tuning, plus DALI and HPA for Triton and vLLM;
    • Ran many research experiments with my team to choose the best framework for LLM inference: custom LLM benchmarks for Triton TensorRT-LLM, vLLM, Text Generation Inference, and SGLang, as well as vLLM optimizations (tensorizer and others);
  4. ML Tracking: Patched the open-source ClearML fileserver to proxy artifacts to S3 instead of local storage, added automatic user creation in MongoDB, and launched test ClearML agents. Rolled out ClearML in production to more than 300 users.
  5. ML Tools:
    • Made a major release with my team of the Python library MLTool 1.0.0, with 97% test coverage, automatic versioned documentation generated from code, and many useful features: wrappers for Triton and Airflow, DB connectors, an S3 client (s5cmd), Docker container fixtures for integration tests, and quick insert to Postgres. More than 250 developers use the library and love it for its clean code and handy features. We also created a quiz-style survey, "How well do you know MLTool?", to popularize it among developers;
    • Built with my team many repository templates for quick starts: launching an Airflow DAG in 5 minutes, launching a web service in 10 minutes, and distributing a Python library via Nexus;
    • Standardized the process of keeping documentation up to date and made the MLOps documentation the starting point for DS developers.
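
To make the Search by Photo flow above more concrete, here is a minimal sketch of the serving path as described conceptually: an image is embedded by a Triton-served model and matched against item embeddings in Qdrant. All names here (model, tensor, collection, URLs) are hypothetical placeholders, not the production configuration.

```python
# Hypothetical sketch of an embedding-retrieval path: Triton embedder + Qdrant search.
# Model name, tensor names, collection name, and URLs are illustrative placeholders.
import numpy as np
import tritonclient.http as triton_http
from qdrant_client import QdrantClient


def embed_image(image: np.ndarray) -> np.ndarray:
    """Compute an image embedding with a model served by Triton."""
    client = triton_http.InferenceServerClient(url="triton.example.local:8000")
    inp = triton_http.InferInput("INPUT__0", list(image.shape), "FP32")
    inp.set_data_from_numpy(image.astype(np.float32))
    result = client.infer(model_name="image_embedder", inputs=[inp])
    return result.as_numpy("OUTPUT__0")[0]


def search_similar_items(embedding: np.ndarray, top_k: int = 10):
    """Look up the nearest item embeddings in a Qdrant collection."""
    qdrant = QdrantClient(url="http://qdrant.example.local:6333")
    hits = qdrant.search(
        collection_name="item_embeddings",
        query_vector=embedding.tolist(),
        limit=top_k,
    )
    return [(hit.id, hit.score) for hit in hits]
```

The design idea is that the embedder behind Triton and the index in Qdrant scale independently, while scheduled Airflow jobs refresh the collection with new embeddings without touching the inference service.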

I also did what a team lead should do: set up a comfortable planning process, generated new ideas for tasks, participated in tons of interviews for new MLOps developers (and wrote the vacancy texts), launched tech demos inside the MLOps team, took part in code reviews, demos, retros, and grooming, wrote sprint-result digests, and much more.

Wildberries
T3 MLOps/Inference Engineer
May 2022 – May 2024 (Remote)

Wrote code for and deployed to production online/offline ML/DL services across the full software engineering lifecycle, using a modern technology stack.

Responsibilities:

  • Optimized code for recommendation services on the main page, deployed ML models as web services using NVIDIA Triton Inference Server (Swin Transformer, ViT SigLIP, YOLOv8), and wrote and deployed models as DAGs (more than 40 pipelines using Prefect);
  • Deployed the largest open-source LLM, Mixtral 8x7B, the first LLM at WB, using the NVIDIA TensorRT-LLM framework on 2 GPUs behind a distributed queue service built on Celery with Redis and RabbitMQ, plus a deployment of the VLM LLaVA;
  • Collected data from different sources and stored it in analytical and relational databases (Postgres, ClickHouse, Greenplum) and S3, and wrote historical data to the DWH using K8s and Prefect;
  • Wrote an in-house MLOps Python library for common utility tasks (connectors for many databases, S3 loaders and backups, request wrappers) following Python packaging best practices, with 100% test coverage;
  • Wrote backends and APIs in Python using FastAPI and other services, set up monitoring of service metrics with Grafana, Thanos, and Prometheus, and shipped logs to the Kibana + Elasticsearch stack (a minimal sketch follows this list);
  • Wrote CI/CD pipelines in GitLab and deployed other services on K8s using Helm charts;
  • Wrote unit and integration tests, performed load testing, and set up monitoring and alerting for services;
  • Participated in infrastructure planning and demos, and acted as a code reviewer.
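
As a rough illustration of the service-monitoring bullet above, here is a minimal sketch of a FastAPI endpoint instrumented with Prometheus metrics. The metric names and route are hypothetical examples, not the actual WB services.

```python
# Minimal sketch: FastAPI service exposing request metrics for Prometheus to scrape.
# Metric names and the route are illustrative placeholders.
import time

from fastapi import FastAPI, Response
from prometheus_client import CONTENT_TYPE_LATEST, Counter, Histogram, generate_latest

app = FastAPI()

REQUESTS_TOTAL = Counter("app_requests_total", "Total requests served", ["endpoint"])
REQUEST_LATENCY = Histogram("app_request_latency_seconds", "Request latency in seconds", ["endpoint"])


@app.get("/recommendations/{user_id}")
def recommendations(user_id: int):
    start = time.perf_counter()
    items = [1, 2, 3]  # placeholder for the real recommendation lookup
    REQUESTS_TOTAL.labels(endpoint="/recommendations").inc()
    REQUEST_LATENCY.labels(endpoint="/recommendations").observe(time.perf_counter() - start)
    return {"user_id": user_id, "items": items}


@app.get("/metrics")
def metrics() -> Response:
    # Prometheus scrapes this endpoint; Grafana dashboards are built on top of Prometheus.
    return Response(content=generate_latest(), media_type=CONTENT_TYPE_LATEST)
```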

Technologies:

  • Python 3.x, Poetry, PDM, Commitizen, Nox, Rye, Asdf, Nexus
  • FastAPI, SQLAlchemy, Celery, Streamlit, NVIDIA Triton Inference Server, TensorRT-LLM Backend, LLMs (Mixtral, OpenChat, LLaVA)
  • PySpark, MLFlow, Airflow, Prefect, DVC, K8S CronJobs, ClearML
  • Docker, Docker Compose, GitLab CI/CD, K8s, Helm charts, K9s, Vault, Multi-Instance GPU (MIG)
  • Postgres, ClickHouse, S3, Greenplum, Redis, RabbitMQ, Kafka
  • Milvus-on-cluster, Voyager, Qdrant
  • PyTest, PyTest-cov, Mypy, Black, flake8, PyLint, Ruff
 
Grid Dynamics
T1 Big Data Engineer
Jan 2022 – Apr 2022 (Remote)

Responsibilities: Studied cloud technologies in preparation for promotion to the T2 Big Data Engineer role and took advanced courses toward Microsoft and AWS certifications.

Reason for leaving: Grid Dynamics closed its offices in Russia, and relocation was not possible.

Technologies: Python 3.x, Amazon S3, Amazon DynamoDB, PySpark, Airflow, GCP

Intern → T1 Data Engineer
Adhack.io → SkillFactory → JetBrains → 4People → GreenAtom
Jul 2019 – Jan 2022 (Remote)

This was the start of my career, when I was trying to find what I really love: from intern to junior positions.

Publications

(2023). Cycle Generation Networks for Sign Language Translation. SignSayTion: implementation of an algorithm for translating Russian text into sign language using a diffusion model for video generation.

PDF

(2021). Correction of spelling errors and typos in the text using BERT. Using transformer features to improve the quality of finding errors and typos in texts.

PDF

Teammates & Mentors & References

Thank you for bringing heart, skill, and truth into everything we do ❤️ (being updated)

WB RecSys Team

Stepan Evstifeev

WB

Team Lead - Personal Multimodal RecSys