Simon Business School

Faculty Profile

Yaron Shaposhnik
Assistant Professor
Phone: 585.275.5592
Office: 3-343 Carol Simon Hall

Research Interests

Stochastic dynamic optimization with learning; applications of machine learning to model analysis; business analytics

Professional History

Assistant Professor
University of Rochester, Simon Business School
September 2016 - Present


Education

Massachusetts Institute of Technology, 2016
Ph.D., Operations Research


Publications

Mining Optimal Policies: A Pattern Recognition Approach to Model Analysis
Journal: INFORMS Journal on Optimization

Scheduling with Testing
Journal: Management Science, Volume 65, Issue 2

Current Research Programs

A Generalized Pandora's Box Problem with Partially Open Boxes
Motivated by modern job recruiting practices, we study a generalization of Pandora's box problem (Weitzman 1979) in which boxes (candidates) could be partially opened (interviewed remotely) at a reduced cost prior to being fully opened (interviewed onsite). This allows the decision-maker to obtain information about boxes in the form of more accurate probability distributions without committing to fully opening them. We identify structural properties of the optimal policy and develop simple and intuitive algorithms with near-optimal performance guarantees.
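For context, the classical (fully open) Pandora's box problem admits a simple index policy based on reservation prices. The sketch below implements that classical baseline for discrete prize distributions, not the paper's partially-open generalization; all function names are illustrative.

```python
import random

def reservation_price(values, probs, cost, tol=1e-9):
    """Weitzman's (1979) reservation price: the z solving E[(X - z)^+] = cost,
    found by bisection for a discrete prize distribution (values, probs)."""
    def excess(z):
        return sum(p * max(v - z, 0.0) for v, p in zip(values, probs))
    lo, hi = min(values) - cost - 1.0, max(values)  # excess(lo) > cost >= excess(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if excess(mid) > cost:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def pandora_policy(boxes, rng):
    """Index policy: open boxes in decreasing reservation-price order and
    stop once the best prize seen exceeds every remaining reservation price.
    Returns realized prize minus total opening costs."""
    best, paid = 0.0, 0.0
    for box in sorted(boxes, key=lambda b: -b["z"]):
        if best >= box["z"]:
            break
        paid += box["cost"]
        best = max(best, rng.choices(box["values"], box["probs"])[0])
    return best - paid
```

The partially-open extension studied in the paper adds an intermediate action (refining a box's distribution at reduced cost) on top of this open/stop structure.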
A Holistic Approach to Interpretability in Financial Lending: Models, Visualizations, and Summary-Explanations
Lending decisions are usually made with proprietary models that provide minimally acceptable explanations to users. In a future world without such secrecy, what decision support tools would one want to use for justified lending decisions? This question is timely, since the economy has dramatically shifted due to a pandemic, and a massive number of new loans will be necessary in the short term. We propose a framework for such decisions, including a globally interpretable machine learning model, an interactive visualization of it, and several types of summaries and explanations for any given decision. The machine learning model is a two-layer additive risk model, which is decomposable into subscales, where each node in the second layer represents a meaningful subscale and all of the nonlinearities are transparent. Our online visualization tool allows exploration of this model, showing precisely how it came to its conclusion. We provide three types of explanations that are simpler than, but consistent with, the global model: case-based reasoning explanations that use neighboring past cases, a set of features that were the most important for the model's prediction, and summary-explanations that provide a customized sparse explanation for any particular lending decision made by the model. Our framework earned the FICO recognition award for the Explainable Machine Learning Challenge, which was the first public challenge in the domain of explainable machine learning.
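A minimal sketch of what a two-layer additive risk model with transparent subscales might look like. All feature names, score bins, point values, and the logistic offset below are hypothetical illustrations, not the paper's fitted model; the point is that every prediction decomposes exactly into named subscale scores.

```python
import math

# Layer 1: each feature maps to points via a transparent threshold lookup.
# Thresholds and points are illustrative, not fitted values.
FEATURE_SCORES = {
    "months_since_delinquency": [(0, 3.0), (12, 1.5), (36, 0.0)],
    "credit_utilization":       [(0.0, 0.0), (0.3, 1.0), (0.7, 2.5)],
    "num_recent_inquiries":     [(0, 0.0), (2, 1.0), (5, 2.0)],
}

# Layer 2: features grouped into meaningful, named subscales.
SUBSCALES = {
    "delinquency": ["months_since_delinquency"],
    "debt_burden": ["credit_utilization", "num_recent_inquiries"],
}

def feature_score(name, value):
    """Points of the largest threshold not exceeding the value."""
    pts = 0.0
    for threshold, p in FEATURE_SCORES[name]:
        if value >= threshold:
            pts = p
    return pts

def predict_risk(x):
    """Return (risk probability, subscale decomposition) for applicant x."""
    subscale_scores = {
        s: sum(feature_score(f, x[f]) for f in feats)
        for s, feats in SUBSCALES.items()
    }
    total = sum(subscale_scores.values())
    prob = 1.0 / (1.0 + math.exp(-(total - 4.0)))  # logistic link, illustrative offset
    return prob, subscale_scores
```

Because the model is additive and the per-feature scores are lookups, every nonlinearity is visible in the tables themselves, which is what makes the interactive visualization and per-decision summaries possible.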
Crowdsourcing the Prediction of Sojourn Times: Methods and Feasibility Study
Mobile apps such as Google Maps, Bing Maps, and Waze are regularly used to predict travel times. This paper explores a similar application in the context of general service systems (rather than transportation problems, which have their own physical characteristics). We develop analytical and data-driven methods for predicting sojourn times in service systems by third parties that have partial information about users' interactions with the system, and conduct numerical experiments to assess the quality of the predictions and the overall feasibility of this practice.
Globally-Consistent Rule-Based Summary-Explanations for Machine Learning Models: Application to Credit-Risk Evaluation
We develop a method for interpreting specific predictions made by (global) predictive models by constructing (local) models tailored to each specific observation (these are also called "explanations" in the literature). Unlike existing work that "explains" specific observations by approximating global models in the vicinity of those observations, we fit models that are globally consistent with predictions made by the global model on past data. We focus on rule-based models (also known as association rules or conjunctions of predicates), which are interpretable and widely used in practice. We design multiple algorithms to extract such rules from discrete and continuous datasets, and study their theoretical properties. Finally, we apply these algorithms to multiple credit-risk models trained on real-world data from FICO and demonstrate that our approach effectively produces sparse summary-explanations of these models in a short amount of time. Our approach is model-agnostic (that is, it can be used to interpret any predictive model), and solves a minimum set cover problem to construct its summaries.
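The set-cover framing can be illustrated with a toy greedy sketch for binary features: each equality predicate that the query point satisfies "covers" (excludes) the past observations that received a different prediction, and we greedily add predicates until no disagreeing observation satisfies the conjunction. The function name and predicates are illustrative; the paper's algorithms (including continuous features and optimality guarantees) are more involved.

```python
def summary_explanation(x, X, preds, target):
    """Greedy sparse conjunction of predicates of the form 'feature f == x[f]'
    (binary features) such that every past row in X satisfying the conjunction
    has model prediction == target. Each predicate excludes the conflicting
    rows that disagree with x on that feature, so this is greedy set cover.
    Returns the chosen feature indices, or None if no consistent rule exists."""
    conflicts = {i for i, p in enumerate(preds) if p != target}
    chosen = []
    while conflicts:
        best_f, best_excluded = None, set()
        for f in range(len(x)):
            if f in chosen:
                continue
            excluded = {i for i in conflicts if X[i][f] != x[f]}
            if len(excluded) > len(best_excluded):
                best_f, best_excluded = f, excluded
        if best_f is None:
            return None  # some conflicting row is identical to x on all features
        chosen.append(best_f)
        conflicts -= best_excluded
    return chosen
```

By construction the resulting rule is globally consistent: every past observation it matches received the same prediction as the query point, which is the property distinguishing this approach from local approximations.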
Scheduling with Testing of Heterogeneous Jobs
This paper studies a canonical scheduling model that captures a fundamental tradeoff between processing jobs and performing diagnostics (testing). In particular, testing reveals information about the required processing time and urgency of the jobs awaiting scheduling, informing future scheduling decisions. The model captures a range of important applications. Prior work focused on special cases (e.g., jobs with IID processing times) to devise optimal policies. In contrast, the current paper studies the most general form of the model and describes a simple heuristic that is analytically guaranteed to have near-optimal performance and performs well in computational experiments. The design of the proposed heuristic and the related worst-case performance analysis rely on interesting connections to related stochastic optimization and bandit problems. Additionally, the paper devises optimal policies for several important extensions of previously studied models.
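As a toy illustration of the processing-versus-testing tradeoff (not the paper's heuristic): with IID exponential jobs on a single machine and a total-completion-time objective, paying to test every job reveals the realized processing times and enables shortest-first sequencing, which can outweigh the total testing cost. All parameters below are assumptions for the sketch.

```python
import random

def total_completion_time(times):
    """Sum of job completion times when jobs run in the given order."""
    clock, total = 0.0, 0.0
    for p in times:
        clock += p
        total += clock
    return total

def simulate(n_jobs=10, test_cost=1.0, trials=2000, seed=0):
    """Monte Carlo comparison of two extreme policies:
    - test all: pay n_jobs * test_cost, learn exact times, run shortest-first;
    - test none: run jobs in arbitrary order (times are IID, so all orders
      have the same expected cost without information).
    Returns (mean cost with testing, mean cost without testing)."""
    rng = random.Random(seed)
    tested, untested = 0.0, 0.0
    for _ in range(trials):
        times = [rng.expovariate(1.0) for _ in range(n_jobs)]
        tested += total_completion_time(sorted(times)) + n_jobs * test_cost
        untested += total_completion_time(times)
    return tested / trials, untested / trials
```

With heterogeneous jobs, as in the paper, the interesting question is which subset of jobs to test and when, rather than the all-or-nothing comparison sketched here.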
Simple Rules for Predicting Congestion Risk in Queueing Systems: Application to ICUs
We study the problem of predicting congestion risk in service systems. Congestion is associated with poor service experience and higher costs, and may even put users at risk, as in medical settings such as ICUs. By predicting future crowdedness, decision-makers can initiate preventive measures, such as rescheduling activities or increasing short-term capacity, in order to mitigate the effects of congestion. To this end, we define "high-risk states" in queueing models as system states that are likely to lead to a congested state in the near future, and strive to formulate simple rules for determining whether a given system state is high-risk. We show that for simple queueing systems, such as the M/M/∞ queue with multiple user classes, such rules can be approximated by linear and quadratic functions on the state space. For more general queueing systems, we use methods from queueing theory, simulation, and machine learning to devise simple prediction rules, and demonstrate their effectiveness through an extensive computational study. Our study suggests that linear rules (which are widely considered interpretable) are very accurate in predicting congestion in ICUs.
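A toy illustration, under assumed parameters, of why a simple cutoff on the current state can predict congestion in a single-class M/M/∞ queue: the probability of reaching a congested occupancy level within a short horizon is monotone in the current occupancy, so a linear threshold separates high-risk from low-risk states. The rates, congestion level, and horizon below are illustrative.

```python
import random

def p_congested(n0, lam=5.0, mu=1.0, K=12, horizon=2.0, trials=2000, seed=0):
    """Estimate P(occupancy of an M/M/inf system reaches K within the horizon,
    starting from n0) by simulating the continuous-time Markov chain:
    arrivals at rate lam, each busy server completes at rate mu
    (total departure rate n * mu)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        t, n = 0.0, n0
        while True:
            rate = lam + n * mu
            t += rng.expovariate(rate)
            if t >= horizon:
                break
            if rng.random() < lam / rate:  # next event is an arrival
                n += 1
                if n >= K:
                    hits += 1
                    break
            else:                          # next event is a departure
                n -= 1
    return hits / trials
```

Since this hitting probability increases with the starting occupancy, the "high-risk" set is an upper interval of states, i.e., it is exactly described by a linear rule (a threshold) in this one-dimensional case; the multi-class and general-system settings in the paper are higher-dimensional analogues.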
Stochastic Selection Problems with Testing
We study the problem of a decision-maker having to select one of many competing alternatives (e.g., choosing between projects, designs, or suppliers) whose future revenues are a priori unknown and modeled as random variables with known probability distributions. The decision-maker can pay to test each alternative to reveal its specific revenue realization (e.g., by conducting market research), and her goal is to maximize the expected revenue of the selected alternative minus the testing costs. This model captures an interesting trade-off between gaining the revenue of a high-yield alternative and spending resources to reduce the uncertainty in selecting it. The combinatorial nature of the problem leads to a dynamic programming (DP) formulation with a high-dimensional state space that is computationally intractable. By characterizing the structure of the optimal policy, we derive efficient optimal and near-optimal policies that are simple and easy to compute. In fact, these policies are also myopic: they only consider a limited horizon of one test. Moreover, our policies can be described using intuitive "testing intervals" around the expected revenue of each alternative, and in many cases the dynamics of an optimal policy can be explained by the interaction between the testing intervals of various alternatives. Finally, we show that some of the insights and results obtained for the problem carry through to more general stochastic combinatorial optimization problems with testing.
Understanding How Data Visualization Tools Work
Dimensionality reduction (DR) techniques such as t-SNE, LargeVis, UMAP, and TriMap have demonstrated impressive visualization performance on many real-world data sets. One tension that has always faced these methods is the trade-off between preserving global structure and preserving local structure: methods can handle one or the other, but not both. In this work, our main goal is to understand what aspects of DR methods are important for preserving structure: it is difficult to devise a better method without a true understanding of the choices made in existing algorithms and their empirical impact on the embedding. We provide several useful design principles based on our new understanding of the mechanisms behind successful DR methods, and use them to design a new DR algorithm that trades off gracefully between local and global structure preservation.