Econometrics
10:30〜12:00
TBA
14:15〜17:20
16:45〜18:15
16:45〜18:15
Abstract
Estimating distributional structure, through tasks such as density estimation and two-sample comparison, is a fundamental problem in data science. Estimating high-dimensional distributions, however, is widely recognized as challenging owing to the curse of dimensionality. In supervised learning, where one must estimate an unknown function that is often defined on a high-dimensional space, a common approach in statistics and machine learning is to use tree-based methods such as boosting, random forests, and Bayesian additive regression trees, which are known to handle such challenging tasks effectively at feasible computational cost. This presentation introduces their counterparts for unsupervised learning. We first introduce a new nonparametric Bayesian model for learning distributions that generalizes the Polya tree process, which was originally developed for low-dimensional density estimation. We then propose a new way of combining multiple tree-based learners in the manner of boosting, leading to improved empirical performance.
This is joint work with Li Ma (Duke University).
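For reference, the Polya tree process mentioned above recursively splits the sample space and assigns Beta-distributed branch probabilities to each split. The sketch below is a minimal illustration, not the speaker's new model: it samples one density on [0, 1) from the standard finite-depth construction that the talk generalizes, with the depth and the Beta-parameter rule c*j**2 chosen as common but illustrative defaults.

```python
# Minimal sketch of the standard (finite-depth) Polya tree prior -- illustrative
# defaults only, not the generalized model presented in the talk.
import numpy as np

rng = np.random.default_rng(0)

def sample_polya_tree_density(depth=8, c=1.0):
    """One draw of a piecewise-constant density from a finite-depth Polya tree.

    The unit interval is split dyadically; level-j splits receive
    Beta(c*j**2, c*j**2) branch probabilities, a standard default that yields
    absolutely continuous densities as the depth grows.
    """
    probs = np.ones(1)                                  # mass of the root interval
    for j in range(1, depth + 1):
        theta = rng.beta(c * j**2, c * j**2, size=probs.size)
        probs = np.column_stack([probs * theta, probs * (1 - theta)]).ravel()
    return probs * 2**depth                             # bin masses -> density heights

density = sample_polya_tree_density()
print("number of bins:", density.size)                  # 2**depth bins on [0, 1)
print("integrates to ~1:", density.mean())              # Riemann-sum check
```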
16:45〜18:15
Abstract: While average treatment effects are a common focus in causal inference, such measures often mask important distributional characteristics of counterfactual outcomes. This work considers the estimation of counterfactual densities under continuous treatments, thereby allowing richer and more detailed insights into the effects of interventions. We propose a Neyman-orthogonal moment condition that treats the conditional outcome density and the generalized propensity score as nuisance parameters. Leveraging this orthogonality within a debiased machine learning (DML) framework ensures asymptotic normality of the estimator of the parameter of interest, even when flexible machine learning methods are used for nuisance estimation. However, two challenges arise in finite samples due to the structure of the proposed moment conditions. First, the double summation within the moment conditions makes standard cross-fitting approaches susceptible to poor estimation performance, especially in small or medium-sized datasets. To address this, we derive theoretical conditions under which DML can be implemented without sample splitting, thus mitigating the performance degradation. Second, the proposed moment conditions involve integrals over the nuisance estimates, so numerical integration errors can degrade estimation accuracy; it is therefore desirable to use nuisance estimators that admit easy analytical integration. As an illustrative example, we employ random forests as the nuisance estimator to satisfy both requirements. We demonstrate the effectiveness of the proposed method through simulation studies.
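To fix ideas, under unconfoundedness the counterfactual density at outcome level y and treatment level t is typically the estimand displayed below; the second display is one kernel-localized Neyman-orthogonal score built from the two nuisances named in the abstract (the conditional outcome density and the generalized propensity score). Both are shown only as an illustration under these assumptions and are not necessarily the exact moment condition of the talk.

```latex
% Illustrative only: counterfactual-density estimand under unconfoundedness and
% one kernel-localized orthogonal score.  K_h and K_b are kernels with
% bandwidths h (treatment) and b (outcome); pi(t|x) = f_{T|X}(t|x) is the
% generalized propensity score.
\[
  \beta(y,t) \;=\; \int f_{Y\mid T,X}(y \mid t, x)\, \mathrm{d}F_X(x),
\]
\[
  \psi(W;\beta,\eta) \;=\; f_{Y\mid T,X}(y\mid t, X) - \beta
  \;+\; \frac{K_h(T-t)}{\pi(t\mid X)}\Bigl(K_b(Y-y) - f_{Y\mid T,X}(y\mid t, X)\Bigr).
\]
```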
16:45〜18:15
Abstract:
Advances in data acquisition technology have made inferences from diverse data sources readily available. Fusion learning refers to fusing inferences from multiple sources or studies to draw a more effective overall inference. We focus on three tasks: 1) whether and when to combine inferences; 2) how to combine inferences efficiently; and 3) how to combine inferences to enhance the inference for a target study. We present a general framework for nonparametric and efficient fusion learning. The main tool underlying this framework is the new notion of a depth confidence distribution (depth-CD), developed by combining data depth, the bootstrap, and confidence distributions. We show that a depth-CD is an omnibus form of confidence regions whose level-set contours shrink toward the true parameter value, and is thus an all-encompassing inferential tool. The approach is efficient, general, and robust, and readily applies to heterogeneous studies across a broad range of complex settings. We demonstrate the approach with an aviation-safety application that tracks aircraft landing performance.
This is joint work with Dungang Liu (U. Cincinnati) and Minge Xie (Rutgers University).
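As a point of reference, the sketch below shows a classical way of combining confidence distributions (CDs) across studies under a normal approximation. It is only a simplified stand-in: the depth-CD of the talk instead uses data depth and the bootstrap so that no parametric form is assumed. The study estimates and standard errors are made-up illustrative values.

```python
# Simplified stand-in for CD fusion, assuming each study reports an estimate
# and a standard error and is well approximated by a normal CD.  The depth-CD
# of the talk replaces this normal approximation with data depth + bootstrap.
import numpy as np
from scipy.stats import norm

studies = [(1.10, 0.30), (0.85, 0.20), (1.40, 0.50)]   # (estimate, SE), made up

def combined_cd(theta, studies):
    """Normal-score combined CD: Phi( sum_i Phi^{-1}(H_i(theta)) / sqrt(k) )."""
    # With a normal H_i, Phi^{-1}(H_i(theta)) is simply (theta - est_i) / se_i.
    z = [(theta - est) / se for est, se in studies]
    return norm.cdf(sum(z) / np.sqrt(len(studies)))

# Read off a fused 95% interval as the set {theta : 0.025 <= H_c(theta) <= 0.975}.
grid = np.linspace(0.0, 2.5, 2001)
hc = np.array([combined_cd(t, studies) for t in grid])
inside = grid[(hc >= 0.025) & (hc <= 0.975)]
print("fused 95% interval: [%.3f, %.3f]" % (inside.min(), inside.max()))
```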
16:45〜18:15
Abstract: Diagnostic tests for regression discontinuity designs face a size-control problem. We document massive over-rejection of the identifying restriction among empirical studies published in the top five economics journals. At least one diagnostic test was rejected in 21 out of 60 studies, whereas fewer than 5% of the 799 collected tests rejected their null hypotheses. In other words, more than one-third of the studies rejected at least one of their diagnostic tests, even though their underlying identifying restrictions appear valid. Multiple testing causes this problem, because the median number of tests per study was as high as 12. We therefore offer unified tests that overcome the size-control problem. Our procedure is based on a new joint asymptotic normality result for local polynomial mean and density estimates. In simulation studies, our unified tests outperformed the Bonferroni correction.
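The size distortion described above is easy to reproduce numerically: with 12 independent diagnostic tests each run at the 5% level, a valid design is flagged almost half the time. The sketch below contrasts this with a Bonferroni correction and a generic joint chi-square (Wald-type) test on independent normal statistics; it illustrates the problem only, since the paper's unified test is instead built on the joint asymptotic normality of local polynomial mean and density estimates.

```python
# Familywise rejection rates when all identifying restrictions hold, assuming
# m independent asymptotically normal diagnostic statistics.  Illustration only.
import numpy as np
from scipy.stats import chi2, norm

m = 12                                              # median number of tests per study
print("P(at least one rejection):", 1 - 0.95**m)   # about 0.46 analytically

rng = np.random.default_rng(0)
z = rng.standard_normal((100_000, m))               # all null hypotheses true

separate   = (np.abs(z) > norm.ppf(0.975)).any(axis=1).mean()
bonferroni = (np.abs(z) > norm.ppf(1 - 0.025 / m)).any(axis=1).mean()
joint_wald = ((z**2).sum(axis=1) > chi2.ppf(0.95, df=m)).mean()

print({"separate 5% tests": round(separate, 3),     # ~0.46
       "Bonferroni":        round(bonferroni, 3),   # ~0.05
       "joint chi-square":  round(joint_wald, 3)})  # ~0.05
```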
10:30〜12:00
Faculty of Law and Economics East Building (法経済学部東館), 2nd floor, Seminar Room 201