Open to opportunities

Ivan Sokol
Data Engineer

Designing and building robust data pipelines, scalable infrastructure, and production-grade data systems that power business decisions. Based in Israel.

8+
Years of Experience
5
Companies
6+
Live Projects
Engineering the Data Layer
That Powers Decisions

I'm a Data Engineer with hands-on experience building data infrastructure from scratch in global SaaS companies. I design, build, and maintain the pipelines and systems that turn raw data into reliable, queryable assets for the entire organization.

I started this portfolio to show how I work end-to-end: from messy inputs to stable production systems. My professional principle is simple: make data reliable first, useful second, and always fast to iterate on.

My background ranges from geospatial analytics and urban planning in Moscow to SaaS data infrastructure at Agritask in Tel-Aviv. I also design experiments to validate hypotheses and support data-driven decisions across cross-functional teams.

Beyond analytics, I build full-stack data products — from automated scraping pipelines and NLP-powered matching tools to interactive dashboards and real-time mapping applications.

📊
End-to-End Pipelines
From raw data ingestion through ETL/ELT to dashboards and serving layers, I own the full data pipeline.
🌍
Geospatial Expertise
B.Sc. in Geoinformatics. Expert in PostGIS, GIS, and location intelligence.
🚀
Production-Grade Systems
I don't just analyze; I deploy. Docker, CI/CD, cloud infrastructure, and monitoring.
Tools I Work With
A practical toolbox for data engineering: ingestion, transformation, orchestration, storage, and observability.
💻
Data Analysis
Python SQL Snowflake PostgreSQL MSSQL pandas NumPy SciPy scikit-learn
📈
Visualization
Power BI Looker Grafana Tableau Qlik Plotly / Dash Matplotlib Seaborn
☁️
Cloud & Infra
AWS GCP Azure Docker CapRover CI/CD Linux
🛠️
Engineering
Git REST API Selenium Playwright Webhook SQLAlchemy Telethon PostGIS React Leaflet NLP
Data Systems Built for Production
From automated pipelines and real-time processing to deployed platforms — each system is engineered end-to-end and runs in production.
echomap.bysokol.com
Personal Project • Full-Stack
EchoMap — Mapping the News
Fully automated data pipeline that collects, classifies, and maps news events in real time. Uses the Telegram API for data ingestion, Gemini AI for entity extraction and geolocation, and PostGIS for spatial queries. Features multilingual support and event categorization (protests, disasters, conflicts).
Python Telegram API Gemini AI PostGIS React Leaflet Docker CapRover
Data Engineering • Analytics
E-Commerce Sales Intelligence
End-to-end e-commerce analytics platform: an automated Selenium scraper collects product data every 6 hours and stores it in PostgreSQL with historical tracking. An interactive Dash dashboard shows KPIs, stock-out rates, high-demand products, price distributions, and trends, all auto-refreshing.
Python Selenium Dash / Plotly PostgreSQL Docker CapRover
Machine Learning • NLP
Job Application Analyzer
NLP-powered tool that scrapes LinkedIn jobs via Playwright, computes resume-to-job semantic similarity using SentenceTransformer embeddings, and runs statistical analysis (t-test, Mann-Whitney, Cohen's d, chi-square) to validate whether similarity predicts interview success.
Python SentenceTransformers Playwright scikit-learn PostgreSQL SciPy
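For readers less familiar with the effect-size step, here is a minimal sketch of Cohen's d with a pooled standard deviation, using only the Python standard library. It is an illustration, not the project's code: the function name `cohens_d` and the similarity scores are made up for the example.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical resume-to-job similarity scores (not real project data).
interviewed = [0.6, 0.7, 0.8]
rejected = [0.4, 0.5, 0.6]
effect_size = cohens_d(interviewed, rejected)
print(f"Cohen's d: {effect_size:.2f}")
```

By the usual rule of thumb, values near 0.2, 0.5, and 0.8 are read as small, medium, and large effects, which is why an effect size is reported next to the p-values from the significance tests.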
Analytical Case Studies

Engineering-focused case studies demonstrating pipeline design, statistical analysis, ML pipelines, and SQL mastery.

📱
Onboarding Funnel & A/B Testing
Mobile App Analytics
Analyzed a five-stage mobile app onboarding funnel and evaluated A/B experiments with 95% confidence intervals. Found a 49.8% lift in one experiment variant. Segmented results by funnel type and gender.
Python pandas SciPy A/B Testing
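As a rough illustration of this kind of evaluation, the sketch below computes the relative lift of a variant over control and a 95% confidence interval for the absolute difference in conversion rates, using the normal approximation. It assumes simple conversion counts per arm. The function name `lift_with_ci` and the counts are hypothetical and are not the figures from the case study.

```python
from math import sqrt
from statistics import NormalDist


def lift_with_ci(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Relative lift of B over A and a normal-approximation CI for p_b - p_a."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    diff = p_b - p_a
    # Standard error of the difference between two independent proportions.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return {
        "lift": diff / p_a,
        "diff": diff,
        "ci": (diff - z * se, diff + z * se),
    }


# Hypothetical counts: 200/1000 conversions in control, 300/1000 in the variant.
result = lift_with_ci(200, 1000, 300, 1000)
print(result)
```

If the interval for the difference excludes zero, the variant's improvement is statistically significant at the chosen confidence level under these assumptions.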
🚘
IMU Sensor Analysis
Automotive / Dashcam
Built an ML pipeline for dashcam IMU data: feature extraction, RandomForest inference, and event detection. Designed a stop-sign compliance annotation protocol with structured classification categories.
Python scikit-learn RandomForest Signal Processing
💰
Campaign & Partner Analytics
Ad Tech / Marketing
Partner attribution analysis, conversion funnels, and profitability modeling for mobile ad campaigns. Identified underperforming partners and calculated loss-per-install metrics across countries.
Python pandas PostgreSQL SQLAlchemy
❄️
Inventory SQL & Star Schema
E-Commerce / Streaming
Advanced SQL analysis: computed maximum sales between restocks using window functions, plus stock level calculations. Designed a star schema for audio streaming events with fact and dimension tables.
SQL Snowflake Window Functions Data Modeling
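One common way to get "max sales between restocks" is to number restock cycles with a running count and then aggregate sales per cycle. The sketch below shows that pattern, run through Python's built-in `sqlite3` as a stand-in for Snowflake (it needs SQLite 3.25 or later for window functions). The table `inventory_events`, its columns, and the sample rows are assumptions made for the example, not the schema from the case study.

```python
import sqlite3

QUERY = """
WITH tagged AS (
    SELECT product_id, event_type, quantity,
           -- Running count of restocks: each restock starts a new cycle.
           SUM(CASE WHEN event_type = 'restock' THEN 1 ELSE 0 END)
               OVER (PARTITION BY product_id ORDER BY event_ts) AS restock_cycle
    FROM inventory_events
),
cycle_sales AS (
    SELECT product_id, restock_cycle, SUM(quantity) AS units_sold
    FROM tagged
    WHERE event_type = 'sale'
    GROUP BY product_id, restock_cycle
)
SELECT product_id, MAX(units_sold) AS max_sales_between_restocks
FROM cycle_sales
GROUP BY product_id
ORDER BY product_id
"""

# Hypothetical events: (product_id, event_ts, event_type, quantity).
EVENTS = [
    ("A", 1, "restock", 10), ("A", 2, "sale", 3), ("A", 3, "sale", 4),
    ("A", 4, "restock", 10), ("A", 5, "sale", 2),
    ("B", 1, "sale", 1), ("B", 2, "restock", 5), ("B", 3, "sale", 5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory_events "
    "(product_id TEXT, event_ts INTEGER, event_type TEXT, quantity INTEGER)"
)
conn.executemany("INSERT INTO inventory_events VALUES (?, ?, ?, ?)", EVENTS)
rows = conn.execute(QUERY).fetchall()
conn.close()
print(rows)
```

The sample data uses distinct timestamps per product. With tied timestamps the default window frame treats the tied rows as peers, so a tie-breaking column in the `ORDER BY` would be needed.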
🔎
Multi-Device & VPN Detection
Cybersecurity / Identity
Classified users as single-device vs multi-device using device IDs, OS fingerprints, and IP overlap. Built VPN detection using Haversine distance and impossible-speed analysis (>1200 km/h).
Python pandas Geospatial Haversine
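The impossible-speed idea can be sketched in a few lines: compute the great-circle distance between two consecutive events for the same user and flag the pair if the implied speed exceeds the 1200 km/h threshold. This is a simplified illustration using only the standard library. The function names, the `(lat, lon, unix_seconds)` event layout, and the sample coordinates are assumptions, not the original implementation.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius
MAX_SPEED_KMH = 1200.0    # threshold quoted in the case study


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def is_impossible_travel(event_a, event_b, max_speed_kmh=MAX_SPEED_KMH):
    """Each event is (lat, lon, unix_seconds). True if the implied speed is too high."""
    lat1, lon1, t1 = event_a
    lat2, lon2, t2 = event_b
    distance = haversine_km(lat1, lon1, lat2, lon2)
    hours = abs(t2 - t1) / 3600
    if hours == 0:
        # Simultaneous events are only suspicious if the locations differ.
        return distance > 0
    return distance / hours > max_speed_kmh


# Hypothetical logins: Tel Aviv, then Moscow one hour later.
tel_aviv = (32.0853, 34.7818, 0)
moscow_1h = (55.7558, 37.6173, 3600)
moscow_5h = (55.7558, 37.6173, 5 * 3600)
print(is_impossible_travel(tel_aviv, moscow_1h))
print(is_impossible_travel(tel_aviv, moscow_5h))
```

A flagged pair is a signal rather than proof of VPN use, since IP geolocation can be inaccurate. That is why it sits alongside device IDs, OS fingerprints, and IP overlap rather than replacing them.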
Selected Outcomes
Concrete results from the production projects and assignments in this portfolio.
274 apps / 29 interviews
Built a job-search analytics system to track funnel performance and interview conversion.
+49.8% uplift
A/B experiment analysis identified a high-performing onboarding variant.
6-hour cadence
Automated production scraping + transformation pipeline with scheduled refreshes.
Geo anomaly detection
Implemented impossible-speed (>1200 km/h) and geospatial checks for VPN/fraud signals.
Professional Experience
8+ years progressing from geospatial data systems to production data infrastructure.
Aug 2025 — Jan 2026
Data Engineer / Data Analyst (Product/Geo)
Rockup • Limassol, Cyprus
  • Designed and automated scalable data pipelines using Python and SQL for production use
  • Built reusable data tables to support analytics and product use cases
  • Investigated data quality issues and improved reliability of data workflows
  • Evaluated external data providers and integrated selected sources into existing pipelines
  • Performed ad-hoc analyses when needed to support product and data decisions
Aug 2023 — Apr 2025
Data Engineer / Data Analyst
Agritask • Tel-Aviv, Israel
  • Architected and maintained the company's core data infrastructure
  • Built efficient ETL pipelines to streamline data ingestion and transformation
  • Designed monitoring and observability dashboards in Grafana and Power BI
  • Automated data workflows with Python — multi-source extraction, transformation, and loading
Nov 2020 — Aug 2023
Data Analyst
Agritask • Tel-Aviv, Israel
  • Built automated reporting pipelines tracking product and business KPIs
  • Designed data models and ran exploratory analysis to inform engineering decisions
  • Delivered interactive dashboards in Power BI, Tableau, and Looker fed by automated data flows
Dec 2019 — Nov 2020
Project Manager
Orhitec GIS • Petah Tiqwa, Israel
  • Engineered geospatial data pipelines and reporting systems for municipalities
  • Converted CAD data to operational GIS
  • Conducted field measurements using AutoCAD, ArcGIS, and orthophotos
Jun 2016 — Jul 2019
Lead Data Analyst — Geospatial
City Architecture Committee • Moscow, Russia
  • Led geospatial data engineering, driving cross-departmental data infrastructure
  • Managed a team of engineers/analysts and conducted code reviews
  • Built data pipelines and dashboards serving product and marketing teams
Academic Background
🎓

B.Sc. in Information Systems and Geoinformatics

Moscow State University of Geodesy and Cartography
2011 — 2015
GIS, SQL databases, web mapping, statistics, data analysis, map design
Recommendations
Recommendations from colleagues and managers on LinkedIn.
"Ivan has exceptional technical expertise. He quickly learns new tools and technologies and applies them proactively to automate and optimize data processing pipelines. He is always up for a challenge and has no problems getting into new areas like GIS backend applications development with REST Web services or developing Qlik dashboards."
Andrew Schetinin • Technology Leader • Senior to Ivan at Agritask
"One of Ivan's standout strengths is his exceptional learning capability. He rapidly adapts to new technologies and processes, swiftly acquiring the necessary skills to integrate emerging tools effectively into our workflows. He regularly developed automated workflows that greatly enhanced productivity, saving significant time and resources across multiple projects."
Ana Silva • Earth Observation Engineer • Worked with Ivan at Agritask
"He has a strong understanding of technologies, requirements, and their real-world applications. What stands out most is his openness to new ideas and his enthusiasm for researching innovative solutions, including AI. Beyond his technical expertise, Ivan has excellent communication skills and a positive attitude, making him a valuable team player."
Alex Gavrishev • Principal Software Engineer • Collaborated on side projects
Let's Work Together
Looking for a data engineer who builds reliable pipelines, scalable infrastructure, and production-ready data systems? Let's talk.