Francisco Blasques

Professor of Econometrics
and Data Science

Vrije Universiteit Amsterdam

Director of Data Science
Metyis Netherlands

Partner & Co-founder
QuantIQ (churned)
datastuff and ACEDA

Research Fellow
Tinbergen Institute

E-mail: f.blasques@vu.nl
Tel: +31 205 985 621

Data science projects

Data science projects pursued in the role of supervisor (professional or academic), chief or lead data scientist. Some are short pilot projects, taking 6 months to 1 year. Other projects were developed over the course of 1-2 years. A few are long-run collaboration projects spanning more than 3 years.

All-in-one customer retention platform

High-end data science engine for churn prediction and prevention. Fully automated AI product for both B2B and B2C, featuring churn and customer lifetime value prediction and prevention, in-depth engagement metrics, next best action, and much more. Take a look at churned.io.


Natural language processing for reading-app

Mixture of experts model engine aimed at delivering automated speech correction feedback. Using artificial data generators and natural language processing (NLP) machine learning techniques in an ensemble mixture of experts environment.


Signal processing for real-time inventory insights

AI engine for low-fi sensor technologies aimed at achieving stable and reliable high-accuracy prediction and classification of on-stock items. Featuring a multi-stage mixture-of-experts ensemble model focused on unlocking automated inventory management insights.


High frequency air-cargo demand forecasting

Air cargo prediction and forecasting using zero-inflated panel-data time-series analysis techniques. Forecast delivered intra-day, at hourly frequency, over both arrivals and departures, at multiple horizons, and ranging through multiple structural breaks.


Power of vibration sensors in prediction of slopping for steel manufacturing

Developing early-warning systems and prevention actions for slopping using multiple real-time online inputs. Testing the role of new vibration sensors on multi-step-ahead prediction of slopping in steel manufacturing.

Predicting surgery times for optimisation of schedules

Analysing a vast-dimensional dataset and deploying ensemble ML techniques to predict surgery times based on characteristics of medical intervention, patient profile, surgery team, and more. Achieved substantial gains in predictive accuracy, leading to improvements in scheduling, overall planning, and financial costs.

Determining product life-cycle stages

Using penalised dynamic panel-data filtering techniques to extract unobserved components from product sales dynamics and ultimately cluster and classify products into different life-cycle stages.


Forecasting demand for agricultural products

Forecasting multi-product assortment with time-series econometrics models featuring trend-seasonality-cycle decomposition and structural breaks. Product clustering based on unobserved time-series characteristics such as seasonality strength and trending behaviour.


Structural analysis of marketing and communication campaigns for charity organisation

Using nonlinear observation-driven model to filter adstock and optimise marketing intensity, marketing mix, and marketing budgets. Model takes into account customer clusters and personalisation, diminishing returns to scale on marketing campaigns, extreme saturation, and cross-pollination across campaigns.

Structural modelling of electricity spot prices and merit order dynamics

Design and validation of a complex structural model to capture real-time merit-order dynamics and price formation in the electricity market. Featuring both demand and supply interactions, filtering the impact of multiple supply sources on spot prices as the merit-order dynamics unfold in real time. Further featuring international market spill-overs through dynamic electricity imports and exports. Used to obtain a correct measure of the relation between wind energy output and the prevailing market price.

Anomaly detection on general practitioners’ recommendations

Leveraging large dataset of referrals to medical specialties in order to measure biases in general practitioners’ recommendations and detect anomalies in treatment and follow-through. Incorporating measurements into transparency and feedback goals.

Modelling educational flows in the Dutch educational system

Dynamic factor time-series econometric techniques used to forecast and obtain insights into the participation levels and student flows within the complex granular Dutch education system. Both forecasts and data insights are of key importance for operational and financial planning.


Further advances on pricing of wind turbines

Using penalised non-parametric techniques to model a dataset of prices of wind turbines. Sensitivity analysis on key determinants of wind-turbine prices and development of long-term forecasts and scenario analysis.

Deep learning in securities fraud and market stability

Employing deep learning techniques in fraud detection and modelling of market instability. Using bootstrapping and simulation-based techniques to develop warning flags for fraudulent operations with reliable statistical confidence levels.

Data-driven demand planning and dynamic pricing optimization

Demand planning and dynamic pricing solutions for large product assortment, using structural ensemble methods with reliable instrumental cost variables and discontinuity measures.

Optimisation of acceptance scorecards for consumer lending products

Deploying ensemble Machine Learning regression and classification methods to optimise acceptance score cards for consumer lending products. Using multiple sources of information to minimise default risk and maximise long-run compliance, stability and robustness.

Predicting short-term migration flows

Dynamic panel data methods for non-stationary multi-country socio-economic dataset. Used to predict short-term migration flows and identify key drivers behind substantive migration dynamics.

Predicting and preventing slopping in steel manufacturing

Using online data-driven filters for real-time multi-second-ahead prediction of slopping events in steel manufacturing. Further multi-second-ahead prediction of slopping probabilities for advanced warning systems.

Dynamic control inventory systems

Optimising inventory management by designing business-specific data-driven decision rules on re-ordering moment and quantity. Trading off the benefits of infrequent re-ordering, large volume discounts, and multi-product bundling against inventory space requirements and the risk of out-of-stock events.
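The trade-off described above can be illustrated with a toy simulation. This is only a minimal sketch, assuming a textbook reorder-point rule (order a fixed quantity whenever stock on hand plus stock on order falls to a threshold); the function name, parameters, and demand figures are hypothetical and are not the business-specific rules developed in the project.

```python
def simulate_reorder_policy(demand, start_stock, reorder_point, order_qty, lead_time):
    """Simulate a simple reorder-point rule (illustrative assumption, not the project's rule).

    An order of size order_qty is placed whenever stock on hand plus
    stock on order is at or below reorder_point. Unmet demand is lost.
    """
    stock = start_stock
    pipeline = []  # open orders as (arrival_period, quantity)
    orders = 0
    stockouts = 0
    total_stock = 0
    for t, d in enumerate(demand):
        # Receive any orders arriving this period.
        stock += sum(q for due, q in pipeline if due == t)
        pipeline = [(due, q) for due, q in pipeline if due != t]
        # Serve demand; count a stockout if demand exceeds stock on hand.
        if d > stock:
            stockouts += 1
        stock = max(stock - d, 0)
        # Review the inventory position and re-order if needed.
        position = stock + sum(q for _, q in pipeline)
        if position <= reorder_point:
            pipeline.append((t + lead_time, order_qty))
            orders += 1
        total_stock += stock
    return {
        "orders": orders,
        "stockouts": stockouts,
        "avg_stock": total_stock / len(demand),
    }


# Hypothetical constant demand of 5 units per period over 12 periods.
demand = [5] * 12
frequent = simulate_reorder_policy(demand, 20, reorder_point=10, order_qty=20, lead_time=2)
bulky = simulate_reorder_policy(demand, 20, reorder_point=10, order_qty=40, lead_time=2)
late = simulate_reorder_policy(demand, 20, reorder_point=0, order_qty=20, lead_time=2)
```

In this toy setting, larger orders reduce the number of re-orders but raise average stock, while a low reorder point saves space at the cost of out-of-stock periods.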

Measuring extreme saturation in advertisement campaigns

Nonparametric methods used to measure extreme saturation in advertising. Extreme saturation causes consumers to actively resent additional advertisement from a given company beyond a certain level of marketing intensity directed at them. This threshold can be consumer-specific and leads to a negative relation between adstock and propensity to buy. Improved ROI can be easily achieved by reducing campaign intensity.
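The mechanism can be illustrated with a small numerical sketch. It assumes a geometric adstock recursion and a piecewise-linear response that declines beyond a threshold; both functional forms and all parameter values are hypothetical simplifications chosen for illustration, not the nonparametric estimator used in the project.

```python
def adstock(spend, decay=0.5):
    """Geometric adstock: current spend plus a decayed carry-over of past stock."""
    stock = 0.0
    path = []
    for x in spend:
        stock = x + decay * stock
        path.append(stock)
    return path


def response(a, threshold=10.0, gain=1.0, penalty=0.5):
    """Hypothetical response: rises with adstock up to the threshold, then declines."""
    if a <= threshold:
        return gain * a
    return gain * threshold - penalty * (a - threshold)


def roi(spend):
    """Total response per unit of spend over the campaign."""
    return sum(response(a) for a in adstock(spend)) / sum(spend)


# Two hypothetical three-period campaigns for a single consumer.
high_intensity = [8, 8, 8]  # adstock path 8, 12, 14 crosses the threshold of 10
low_intensity = [5, 5, 5]   # adstock path 5, 7.5, 8.75 stays below it
```

Under these assumptions the lower-intensity campaign has the higher return per unit of spend, because the high-intensity campaign pushes adstock into the region where the response falls.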

Bayesian tensor deep learning for digital image processing and classification

Leveraging Bayesian neural networks for image recognition and classification of skin-lesions. Data-driven melanoma detection obtained through automated image processing algorithm.

Data-driven dynamic pricing with structural causal machine learning models

Delivering automated dynamic pricing solutions for large retail and consumer electronics chains, featuring price elasticities on nonparametric models and ensembles, complex cross-product interactions, inventory targets, traffic generators, KVIs, cyclical product updates, holiday promotions, and end-of-season price markdowns.

High-dimensional demand forecasting

Comprehensive high-dimensional time-series forecasting. Automated engine designed to deliver high-frequency multi-horizon demand predictions across very large product assortment, at multiple levels of granularity, and multiple international markets.

Predictive maintenance for wind turbines

Optimization of maintenance staffing and scheduling using structural parameter-driven models and simulation-based indirect inference techniques. The model trades off potential revenue loss against maintenance costs, incorporates a time-varying probability of failure and dependence on external drivers such as wind speed, and generates both spatial and temporal clustering of failure rates.

Image processing for AI-based melanoma risk assessment

Leveraging indirect inference, auxiliary statistics, chaotic process theory and convolutional neural networks to assess the risk of melanoma in patients. Improvements on earlier project include enhanced predictive accuracy and robustness with asymmetric loss function.

Network modelling of European interbank market

Structural dynamic network model for the European interbank market. Combining game theory and indirect inference methods. Project funded by the SWIFT Institute in the UK.

Product recommendation for large e-commerce platform

Leveraging structural causal supervised models and AB-testing for optimal online product recommendation. Featuring real-time feedback on product views, smart segmentation, product-push integration, promotional events, seasonal trends and priorities, inventory restrictions, etc.

Data-driven marketing optimisation

Using nonlinear observation-driven dynamic model to filter adstock and optimise marketing intensity, marketing mix, and marketing budgets. Model takes into account customer clusters and personalisation, diminishing returns to scale on marketing campaigns, extreme saturation, and cross-pollination across campaigns.

Improving inventory systems with dynamic control models

Achieving advanced data-driven inventory management by combining structural time-series, operations research, and AI techniques. Leveraging structural filtering techniques to minimise both total inventory volume and out-of-stock events at the same time. Filtering takes into account current inventory, expected product demand, order times, product shelf-life and inventory restrictions.

Predicting financial risk and market volatility

Applying dynamic volatility filtering to high-dimensional environments with composite likelihood methods. Application of new robust filtering techniques in high dimensions for portfolio and market risk assessment.

Real estate value prediction using deep learning

Predicting real-estate prices for a vast dataset of the housing market in California. Using a semi-parametric approach featuring linear regression and sparse ANN deep learning to leverage both structured and unstructured data sources, and to model both simple and complex interactions between real-estate prices and their determinants.

Demand planning: breaks and scenario analysis

Demand planning featuring large-but-predictable structural breaks in demand and inventory rupture. Structural model used for simulation-based scenario analysis of disruptive promotional campaigns and marketing impact.


Demand planning: agricultural seeds

Obtaining long-term forecasts for sales of agricultural seeds at multiple levels of granularity. Ensuring internal consistency between forecasts at SKU level, seed type, segment, country, and specificity. Deploying time-series classification methods to ensure correct targeting of sub-groups of time-series with trends, seasonal effects, breaks, etc.
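One simple way to obtain internally consistent forecasts is bottom-up aggregation: forecast at the finest level and sum upwards, so that every higher level adds up by construction. The sketch below assumes this approach purely for illustration; the SKU names, attributes, and figures are hypothetical, and the project may have used a different reconciliation method.

```python
from collections import defaultdict


def aggregate_bottom_up(sku_forecasts, attributes, level):
    """Sum SKU-level forecasts into totals for the given aggregation level."""
    totals = defaultdict(float)
    for sku, forecast in sku_forecasts.items():
        totals[attributes[sku][level]] += forecast
    return dict(totals)


# Hypothetical SKU-level forecasts and their place in the hierarchy.
sku_forecasts = {"sku_1": 120.0, "sku_2": 80.0, "sku_3": 50.0}
attributes = {
    "sku_1": {"seed_type": "maize", "country": "NL"},
    "sku_2": {"seed_type": "maize", "country": "DE"},
    "sku_3": {"seed_type": "tomato", "country": "NL"},
}

by_seed_type = aggregate_bottom_up(sku_forecasts, attributes, "seed_type")
by_country = aggregate_bottom_up(sku_forecasts, attributes, "country")
grand_total = sum(sku_forecasts.values())
```

Because every aggregate is derived from the same SKU-level numbers, the seed-type and country totals both sum to the same grand total.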


Churn prevention: causal methods and AB-testing

Churn prediction and prevention for customer segments using dynamic structural XGBoost and LightGBM with causal instrumental variables for marketing impact and optimisation. Strengthening of structural causal validation and testing through limited short-sample AB-testing.

Risk modeling: extreme events with missing data

Filtering probabilities of extreme events in incomplete panels of financial assets. Applying long-tail models to capture extreme event probabilities in big-data setting with structural breaks and time-varying uncertainty.

Pricing airline tickets with meta-data

Dynamic data-driven pricing of airline tickets using meta-data from third-party travel websites and applications. Improving demand prediction and optimal pricing through parametric and non-parametric modelling of demand drivers and pricing effects.

Spatial modelling of housing market trends and volatility

Sparse spatial machine learning algorithms coupled with statistical spatial models for prediction of housing prices and spatial volatility. Identification of trends in determinants of real-estate prices across space and time.

Mortgage interest rate optimization on multi-agent oligopoly

Dynamic factor multi-agent causal regression model for optimisation of mortgage interest spreads. Obtaining game-theoretic equilibria for oligopoly market which is subject to external drivers.

Segmentation and prediction of school performance

Building predictor of student-school performance in rich big data setting. Leveraging features from school characteristics, student population profile, and local spatial information, among others.

Out-of-sample spatial network modelling of demand for new airline flight routes

Spatial gravity network models for predicting out-of-sample demand for new flight routes in the airline industry. Deployment of demand scenario analysis on new routes based on social-networking and travel-agency meta-data information.

ALM and data-driven portfolio optimisation

High-dimensional risk modelling, volatility filtering, and dynamic control optimiser for real-time financial portfolio optimisation.


Optimising ROAS for AdWords campaigns

Deploying structural-causal non-parametric regression methods to measure and optimise ROAS in online marketing campaigns. Featuring optimal ad expenditure across customer clusters, time of the day, day of the week, and seasonality effects.

Filtering and optimization of wind-turbine efficiency

Robust filtering of wind-direction for optimal pitch and yaw alignment of wind turbines. Using observation-driven score filters for multi-second-ahead prediction of wind speed and wind direction. Coupled with optimal control model for real-time optimisation of pitch and yaw.