Hi, I am Arnob!

Welcome to my portfolio

My aim is to solve challenging problems and help society with the power of data, statistics, machine learning, and AI.

Uppsala, Sweden • Eligible to Work in Sweden

About Me

Data Scientist with 5+ years of experience building and deploying production-grade forecasting systems, risk assessment models, and analytical solutions for data-driven decision support. I specialize in transforming complex analytical requirements into scalable, business-impacting deliverables.

My expertise spans time-series forecasting, credit risk modeling, anomaly detection, and A/B testing frameworks. I have delivered credit risk models built on 887K loan records that achieved 94% AUC-ROC, forecasting systems with a 28% error reduction, and A/B tests that generated 321% ROI.

I am skilled in Python, SQL, and machine learning (Logistic Regression, Random Forest, XGBoost, CNN), and I am currently learning MLOps practices. I have experience with ETL pipeline development, REST API integration, and cloud infrastructure (AWS, GCP), and I collaborate well with international teams.

I hold a PhD in Computational Physics with deep expertise in statistical modeling, Monte Carlo methods, and numerical optimization, skills that transfer directly to modern data science challenges.

Things I Can Do

These are the things I am passionate about:

Machine Learning
Python & SQL
Statistical Modeling
Data Visualization
ETL Pipelines
Table Tennis
Cooking
Board Games

Projects & Research

A portfolio of production-grade data science projects demonstrating expertise in credit risk modeling, time-series forecasting, A/B testing, and deep learning. All projects are available on GitHub with documented code and analysis.

CO₂ Emission Forecasting Platform

2024
Production ETL Pipeline with Time-Series Forecasting

Developed production ETL pipeline integrating REST APIs for automated data ingestion with validation, transformation, and scheduled workflows. Implemented time-series forecasting using Prophet, ARIMA, and SARIMA models, achieving 28% error reduction through iterative model improvements and statistical validation. Features automated convergence checks and performance monitoring.

Time-Series • ARIMA/SARIMA • Prophet • ETL Pipeline
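
An error-reduction figure such as the one above is a relative improvement in a forecast error metric. The sketch below shows one way such a comparison could be computed. It assumes mean absolute error as the metric and a seasonal-naive forecast as the baseline; neither is stated for this project, and every number and function name here is illustrative rather than project code or project data.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and forecast values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def seasonal_naive_forecast(history, horizon, season_length):
    """Repeat the last observed season forward for `horizon` steps."""
    if len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_season = history[-season_length:]
    return [last_season[step % season_length] for step in range(horizon)]


def relative_error_reduction(baseline_error, model_error):
    """Fraction by which the model's error is lower than the baseline's."""
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_error - model_error) / baseline_error


# Synthetic values for illustration only (assumption, not project data).
history = [10.0, 12.0, 14.0, 11.0, 10.5, 12.5, 14.5, 11.5]
actual = [11.0, 13.0, 15.0, 12.0]

baseline_forecast = seasonal_naive_forecast(history, horizon=4, season_length=4)
# Stand-in for the output of a fitted model such as Prophet or SARIMA.
candidate_forecast = [10.8, 13.1, 14.7, 12.2]

baseline_mae = mean_absolute_error(actual, baseline_forecast)
candidate_mae = mean_absolute_error(actual, candidate_forecast)
reduction = relative_error_reduction(baseline_mae, candidate_mae)

print(f"baseline MAE={baseline_mae:.3f}, "
      f"candidate MAE={candidate_mae:.3f}, reduction={reduction:.1%}")
```

Evaluating both forecasts on the same held-out window keeps the comparison fair; the choice of metric and baseline should be stated alongside any reported reduction.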

E-commerce A/B Testing Framework

2024
Statistical Analysis & Business Impact

Designed and executed A/B testing framework for e-commerce free shipping strategy. Conducted comprehensive statistical analysis including hypothesis testing and customer segmentation, uncovering Simpson's Paradox in the data. Delivered data-driven recommendations that generated 321% ROI with +109K BRL projected profit increase through actionable business insights.

A/B Testing • Hypothesis Testing • 321% ROI • Simpson's Paradox
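
Simpson's Paradox occurs when one variant wins inside every customer segment yet loses once the segments are pooled, usually because the variants have very different segment mixes. The sketch below checks for that pattern. The counts are synthetic and the segment and variant names are hypothetical; none of it is taken from the project's data.

```python
# (conversions, visitors) per variant and segment.
# Synthetic counts chosen to exhibit the paradox (assumption, not project data).
data = {
    "control": {"segment_a": (81, 87), "segment_b": (192, 263)},
    "treatment": {"segment_a": (234, 270), "segment_b": (55, 80)},
}


def rate(conversions, visitors):
    """Conversion rate for one cell."""
    return conversions / visitors


def pooled_rate(segments):
    """Conversion rate after summing all segments of one variant."""
    total_conversions = sum(c for c, _ in segments.values())
    total_visitors = sum(v for _, v in segments.values())
    return total_conversions / total_visitors


def shows_simpsons_paradox(data, variant_a, variant_b):
    """True if variant_a beats variant_b in every segment but loses when pooled."""
    segments = data[variant_a].keys()
    a_wins_each_segment = all(
        rate(*data[variant_a][s]) > rate(*data[variant_b][s]) for s in segments
    )
    a_loses_pooled = pooled_rate(data[variant_a]) < pooled_rate(data[variant_b])
    return a_wins_each_segment and a_loses_pooled


for variant, segments in data.items():
    per_segment = {s: round(rate(*cell), 3) for s, cell in segments.items()}
    print(variant, per_segment, "pooled:", round(pooled_rate(segments), 3))

paradox = shows_simpsons_paradox(data, "control", "treatment")
print("Simpson's Paradox present:", paradox)
```

When the check returns True, the pooled comparison is misleading and the per-segment results, or a segment-adjusted estimate, are the safer basis for a recommendation.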

CNN Anomaly Detection System

2023-2024
Deep Learning Pipeline for Image Classification

Led end-to-end deep learning pipeline processing 100K+ images: Monte Carlo generation, transformation, parallel processing, and CNN model deployment using TensorFlow and Keras. Achieved 97% accuracy for anomaly detection in large-scale imbalanced datasets, demonstrating techniques for detecting rare patterns in production environments.

Deep Learning • CNN • TensorFlow • 97% Accuracy
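
On imbalanced data, accuracy is usually read together with precision and recall, since a classifier that never flags an anomaly can still score high accuracy. The sketch below illustrates this with synthetic labels. The 95/5 class split, the predictions, and the function names are assumptions made for the example and are not results from this project.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def summarize(y_true, y_pred):
    """Accuracy, precision, and recall; undefined ratios are reported as 0.0."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Synthetic labels: 95 normal samples (0) followed by 5 anomalies (1).
y_true = [0] * 95 + [1] * 5

# A trivial classifier that never flags an anomaly.
always_normal = [0] * 100
# A hypothetical detector: 1 false alarm, 4 anomalies caught, 1 missed.
detector = [0] * 94 + [1] * 1 + [1] * 4 + [0] * 1

trivial_scores = summarize(y_true, always_normal)
detector_scores = summarize(y_true, detector)
print("always normal:", trivial_scores)
print("detector:     ", detector_scores)
```

The trivial classifier reaches high accuracy with zero recall, which is why recall on the rare class is the more informative number for anomaly detection.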

GPTPlot - Open Source Data Visualization Tool

2023-Present
Published to PyPI

Developed and deployed an open-source Python CLI tool for automated exploratory data analysis and scientific visualization. Supports multiple file formats (CSV, DAT, TXT) with comprehensive plotting capabilities. Published to PyPI, demonstrating end-to-end product development. Currently integrating an LLM for AI-assisted analysis.

Python CLI • Open Source • Data Visualization • PyPI

All projects include detailed documentation, clean code, and reproducible analysis. View my complete portfolio on GitHub.

View GitHub Profile

Resume

Download My CV

Complete curriculum vitae with detailed research experience, publications, and skills

Download PDF

Professional Experience

Nov 2023 - Nov 2025

Machine Learning Developer

Uppsala University, Sweden

Built production-grade forecasting systems, credit risk models (887K records, 94% AUC-ROC), and A/B testing frameworks (321% ROI). Developed ETL pipelines with REST APIs, CNN anomaly detection (97% accuracy), and Monte Carlo stress-testing frameworks for financial risk assessment.

Apr 2022 - Jul 2023

Research Data Engineer

University of Tennessee, Knoxville, USA

Architected scalable data processing pipelines (40% efficiency improvement), implemented batch processing automation with HPC systems (SLURM, Cron), and developed open-source Python CLI tool (GPTPlot) published to PyPI. Collaborated with international teams on large-scale data projects.

Aug 2014 - Nov 2021

Doctoral Researcher - Computational Science

IISER Mohali, India

Developed gradient descent optimization algorithms, applied Fourier Transform for signal processing, and conducted Monte Carlo simulations for complex system modeling. Built foundation in statistical modeling and numerical optimization directly applicable to data science.

Education

PhD in Computational Physics

IISER Mohali, India

2014 - 2021

MSc in Applied Physics

IIT Dhanbad, India

2010 - 2012

Certifications

Machine Learning Specialization

DeepLearning.AI / Stanford University

2023

Physics and Finance

Uppsala University

2024

Additional Information

  • Languages: English (Fluent), Swedish (Beginner, A1-A2), Hindi & Bengali (Native)
  • Work Authorization: Fully eligible to work in Sweden; available for roles in Uppsala, Stockholm, and surrounding areas
  • Availability: Immediate start possible; open to hybrid and on-site arrangements
  • Interests: Chess, Badminton, Cooking, Table Tennis