NUS: Department of Statistics and Applied Probability
 
 

Seminar Details

Title: Feature Selection in High Dimensional Regression: A look at LASSO and Correlation Screening

Speaker: Mr Lim Chinghway, University of California, Berkeley and Department of Statistics and Applied Probability, National University of Singapore

Date: 21 August 2008 (Thursday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118, Seminar Room

Abstract

In the modern information age, it is increasingly common to find statistical applications with a large number of covariates. Further complicated when there are fewer observations than covariates, this presents a new challenge for the problem of feature selection. In the realm of linear models, the Lasso has emerged as a popular technique to recover sparsity. There has been a considerable amount of work on its consistency results as well as on extensions to the technique.

Separately, correlation screening, as simple as it sounds, has gathered renewed interest in recent work. In this talk, I will give an overview of the two methods and present some of their consistency results. I will discuss their limitations and how using both together can harvest their individual merits. I will also give a brief summary of their extensions to the generalized linear model and highlight some problems of interest.
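
As a rough illustration of the two methods named in this abstract, the sketch below first screens covariates by absolute marginal correlation with the response and then runs a Lasso fit on the survivors. It is not the speaker's procedure: the order of combination, the coordinate-descent solver, the function names, the simulated data, and the tuning values `k` and `lam` are all assumptions made for the example.

```python
import numpy as np

def correlation_screen(X, y, k):
    """Return (sorted) indices of the k columns of X with largest |corr(X_j, y)|."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return np.sort(np.argsort(-np.abs(corr))[:k])

def soft_threshold(z, t):
    """Soft-thresholding operator used in the Lasso coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()  # current residual y - Xb (b starts at zero)
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]            # remove column j's contribution
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]            # add the updated contribution back
    return b

def screen_then_lasso(X, y, k, lam):
    """Hypothetical combination: screen down to k covariates, then Lasso."""
    keep = correlation_screen(X, y, k)
    Xs = X[:, keep] - X[:, keep].mean(axis=0)
    b = lasso_cd(Xs, y - y.mean(), lam)
    return keep[np.abs(b) > 1e-8]

# Simulated example with fewer observations than covariates (assumed setup).
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[[3, 50, 120]] = [3.0, -2.5, 2.0]
y = X @ beta + 0.5 * rng.standard_normal(n)

selected = screen_then_lasso(X, y, k=50, lam=0.3)
print(selected)
```

Screening first keeps the Lasso problem small; the price is that a relevant covariate with weak marginal correlation can be dropped before the Lasso ever sees it.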



Title: On a Constrained ARCH Model for the Prediction of VaR


Speaker: Mr Wang Mengxi, Department of Statistics and Applied Probability, National University of Singapore

Date: 15 August 2008 (Friday)

Time: 3:00pm - 4:00pm

Venue: S16-07-107, DSAP Reading Room

Abstract

Risk management is an important aspect of the financial industry and has gained increasing attention after the recent financial turmoil triggered by insufficient or ineffective risk management practice. Value-at-Risk (VaR), an advanced technique for modeling the risk of assets, has been widely accepted and has grown in popularity among not only financial firms but also industrial corporations over the past two decades. This method is appealing because of its ability to integrate several market risk factors into one single measure, often expressed as a dollar amount or a percentage of the asset value, of the potential loss over a specific period of time. As the most important input variable for estimating VaR, the financial volatility forecast has become the center of the problem and an important field in statistics research.

It is often recognized that financial time series have some prominent characteristics: volatility clustering, i.e. large changes tend to be followed by large changes and small changes by small changes; leptokurtosis, or a fat-tailed distribution of the financial returns; and the leverage effect, that is, changes in stock prices tend to be negatively correlated with changes in volatility, meaning volatility is higher after negative shocks than after positive shocks of the same magnitude. Much research has been done in this area, and it has been shown that popular models like GARCH and ARCH are able to model the non-normality of the returns as well as changing conditional volatilities, and hence provide good forecasts. Some improved models such as GJR and EGARCH have been developed to capture the leverage effect of financial returns. In this project, a new model, the Constrained Volatility ARCH (CARCH) model, is proposed to provide a better balance between model flexibility and stability by imposing constraints on the coefficients. These constraints result in a natural selection process driven by the data itself to achieve a more parsimonious model with better flexibility than the GARCH(1,1) model, yet with better prediction, measured by smaller MAD (Mean Absolute Deviation). Several financial time series were tested in this study, and the results suggest that the new CARCH model significantly outperforms the conventional GARCH models in terms of quantile prediction and from a risk management perspective.



Title: Integration of Heterogeneous Datasets for the Prediction of Directly Regulated Genes


Speaker: Mr Deng Niantao, Department of Statistics and Applied Probability, National University of Singapore

Date: 01 August 2008 (Friday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118, Seminar Room

Abstract

Estrogen Receptor (ER) is a master transcriptional regulator in breast cancer and is an archetype of a molecular therapeutic target. Experiments have been performed to map ER binding sites on a genome-wide basis using various chromatin immunoprecipitation (ChIP) techniques. Lin and Vega (2007) applied the ChIP-PET strategy to map ER binding sites in MCF-7 cancer cells and found that only 5% of the ER binding sites were within the proximal gene promoter regions, while the majority were mapped further away from genes. In order to understand the ER impact on regulation, we integrated various datasets and explored the association between binding sites and regulated genes in terms of their distance, the binding strength and the concentration of the binding regions. We identified some important factors which contribute to direct regulation and tentatively proposed a score function for genes to measure their potential to be directly regulated. Numerical results are shown for the control gene group and the expressed gene group and are compared by Receiver Operating Characteristic (ROC) curve analysis.



Title: A Moment Substitution Approach to Fitting Linear Regression Models with Categorical Covariates Subject to Randomized Response


Speaker: Mr Wang Zijian, Gerald, Department of Statistics and Applied Probability, National University of Singapore

Date: 28 July 2008 (Monday)

Time: 3:00pm - 4:00pm

Venue: S16-06-118, Seminar Room

Abstract

In this paper, we present an alternative approach to Van den Hout and Kooiman (2006) for estimating the linear regression model with categorical covariates subject to randomized response (RR). Specifically, we consider Warner's (1965) scheme of randomization. Our approach essentially consists of moment substitution, where we estimate the latent first, second and cross product moments in the usual least squares estimator for the centred model with their associated observed unbiased estimates. For the problem of estimating subgroup means in a dichotomous population, we show that this moment substitution approach is equivalent to Selen's (1986) estimator under appropriate distributional assumptions. Assuming independent randomizations, this approach is further adapted to the case of multiple linear regression, when some or all of the covariates are subject to RR. Ultimately, it is shown that the estimates yielded by this method are asymptotically equivalent to the measurement error model estimates of Fuller (1987) under suitable transformations.



Title: Statistical Analysis of a Time-Course Nasopharyngeal Carcinoma Gene Expression Data


Speaker: Mr Md. Atikur Rahman Khan, Department of Statistics and Applied Probability, National University of Singapore

Date: 23 July 2008 (Wednesday) changed to 25 July 2008 (Friday)

Time: 1:30pm - 2:30pm changed to 3pm - 4pm

Venue: S16-06-118, Seminar Room


Abstract

A common goal of microarray experiments is to identify genes that are differentially expressed under different biological conditions. Time-course microarray experiments can be used to detect temporal differential gene expression under these conditions. Our aim in this study is to investigate the time-course regulation and differential expression of genes in a dataset from an in vitro experiment that applies a cyclin-dependent kinase (CDK) inhibitor to 3 Nasopharyngeal Carcinoma (NPC) cell lines. We explored different aspects of this dataset using hierarchical clustering based on a distance measure and principal component analysis. Time-course regulation of genes was studied using time-course pattern and profile analysis. We performed gene ontology (GO) category enrichment together with the differential gene expression analysis and hypothesized that some genes on different pathways responded significantly to the CDK inhibitor.



Title: Bootstrap Methods for Semi-Parametric Goodness of Fit Tests


Speaker: Prof G. Jogesh Babu, Department of Statistics, The Pennsylvania State University

Date: 23 July 2008 (Wednesday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118, Seminar Room


Abstract

Nonparametric goodness of fit tests are generally based on the empirical distribution function. The well-discussed problem of goodness of fit testing when parameters are estimated will be revisited. Bootstrap methods to estimate the null distributions of various goodness of fit test statistics will be presented. These results hold not only in the univariate case but also in the multivariate setting. These ideas are taken a step further to develop non-parametric resampling methods for inference when the data come from an unknown distribution which may or may not belong to a specified family of distributions.



Title: Multivariate Linear and Nonlinear Causality Tests with Applications


Speaker: Miss Zhang Bingzhi, Department of Statistics and Applied Probability, National University of Singapore

Date: 21 July 2008 (Monday)

Time: 3:00pm - 4:00pm

Venue: S16-06-118, Seminar Room


Abstract

The traditional linear Granger test has been widely used to examine the linear causality between any pair of time series. Hiemstra and Jones (1994) developed a nonlinear Granger causality test to investigate the nonlinear causality between stock prices and trading volume. In this thesis, we extend their work by developing both linear and non-linear causality tests in multivariate settings instead of in a pairwise context. We then apply the tests to identify the linear and non-linear multivariate causality relationships among the indices of the Chinese segmented stock markets.



Title: Option Pricing with Aggregation of Physical Models and Empirical Learning
[Joint Seminar Between DSAP and RMI]

Speaker: Prof Jianqing Fan (Princeton University) and Prof Loriano Mancini (Bendheim Centre for Finance, Princeton University)

Date: 10 July 2008 (Thursday)

Time: 3:30pm - 5:30pm

Venue: S14-03-10, CRA, Department of Mathematics
changed to S16-06-118 (Seminar Room)

Abstract

Financial mathematical models are useful tools for option pricing. These physical models provide a good first order approximation to the underlying dynamics in the financial market. Their pricing performance can be significantly enhanced when they are combined with statistical learning approaches, which empirically learn and correct pricing errors through estimating state price densities. In this paper, we propose a new semiparametric technique for estimating state price densities and pricing financial derivatives. This method is based on a semiparametric approach to estimating the survivor function of a normalized state variable and is easy to implement. Our method can be combined with any model-based pricing formula to correct the systematic biases of pricing errors and enhance the predictive power. Empirical studies based on S&P 500 index options show that our method outperforms several competing pricing models in terms of predictive and hedging ability.



Title: Estimation of High-dimensional Covariance Matrix
[Joint Seminar Between DSAP and RMI]

Speaker: Prof Jianqing Fan, Princeton University

Date: 10 July 2008 (Thursday)

Time: 2:00pm - 3:00pm

Venue: S14-03-10, CRA, Department of Mathematics
changed to S16-06-118 (Seminar Room)

Abstract

High dimensionality comparable to the sample size is a common feature in portfolio allocation, risk management, genetic networks and climatology. In this talk, we first use a multi-factor model to reduce the dimensionality and to estimate the covariance matrix for portfolio allocation and risk assessment. The impacts of dimensionality on the estimation of the covariance matrix and its inverse are examined. We identify the situations under which the factor approach can substantially improve performance and the cases where the gains are only marginal, in comparison with the sample covariance matrix. Furthermore, the impacts of the covariance matrix estimation on portfolio allocation and risk management are studied. Viable covariance modeling and sparse and robust portfolio allocations are recommended based on our mathematical results.

In another class of problems, such as genetic networks or climatology, sparsity of the covariance matrix or its inverse arises naturally. We then estimate high-dimensional covariance matrices using the penalized likelihood method to exploit the sparsity. New algorithms are proposed. Optimal rates of convergence, sparsistency, and asymptotic normality are established. Our theoretical results are verified by simulation studies and illustrated by several applications.



Title: Testing for Interactions in General Semiparametric Analysis of Repeated Measures Data, With Application to Testing for Main Effects of Genes with Possible Environmental Applications

Speaker: Prof Raymond J. Carroll, Distinguished Professor, Professor in the Department of Statistics, Professor of Nutrition and Toxicology, Department of Statistics, Texas A&M University, TAMU

Date: 09 July 2008 (Wednesday)

Time: 11:00am - 12:00noon

Venue:

Abstract

This talk considers the general problem where the data for an individual are repeated measures in the most general sense, with a parametric component and a nonparametric component. In gene-environment interaction studies, it is often of interest to test for the main effects of genes (the parametric components) when there might be interactions with the environment (the nonparametric component). Rather than build complex models for the interactions, we use a Tukey-type 1-degree of freedom formulation that has the promise to improve power for testing whether there are any genetic effects. We derive a general profile-type score statistic and show how to implement it, which involves circumventing the need to solve an integral equation. Extensions to semiparametric additive models with repeated measures are described.



Title: Risk-Adjusted Cumulative Sum Control Charting Procedures

Speaker: Miss Lin Lin, Department of Statistics and Applied Probability, National University of Singapore

Date: 07 July 2008 (Monday)

Time: 4:00pm - 5:00pm

Venue:

Abstract

Risk-adjusted charts for monitoring the performance of a surgeon or a group of surgeons have recently gained prominence in the literature. This began in 1997 with the introduction of a chart that plots the cumulative expected mortality counts minus the observed counts. The statistic plotted is intuitive and it has gained widespread attention and adoption. However, the run length performance of this chart is still not clearly understood because of the lack of a signalling rule. A cumulative sum chart based on testing the odds ratio that a patient dies was proposed in 2000. The run length performance of this chart is optimal, but the chart is not as easy to interpret because of the inherent difficulty in interpreting odds ratios. Between 2000 and 2008, many papers were published comparing these two charts. Although these two charts seem different, we show that they are in fact mathematically identical and we present a unified approach based on testing the risks directly.
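
To make the two charts described in this abstract concrete, the sketch below computes both statistics from a sequence of predicted death probabilities and observed outcomes. It is an illustration under stated assumptions, not the speaker's unified approach: the log-likelihood-ratio weights are a commonly used form for testing an odds ratio of 1 against a chosen alternative, and the function names, the alternative `odds_ratio=2.0` and the control limit `h=4.5` are hypothetical choices.

```python
import numpy as np

def cumulative_expected_minus_observed(p, y):
    """Running sum of (predicted death probability - observed outcome)."""
    return np.cumsum(np.asarray(p, dtype=float) - np.asarray(y, dtype=float))

def risk_adjusted_cusum(p, y, odds_ratio=2.0, h=4.5):
    """CUSUM of log-likelihood-ratio weights for H0: odds ratio 1 vs H1: odds_ratio.

    p : predicted death probabilities from a risk model (assumed given)
    y : observed outcomes, 1 = death, 0 = survival
    Returns the chart path and the index of the first signal (None if no signal).
    """
    p = np.asarray(p, dtype=float)
    y = np.asarray(y)
    denom = 1.0 - p + odds_ratio * p
    # Weight is log(R / denom) for a death and log(1 / denom) for a survival.
    w = np.where(y == 1, np.log(odds_ratio / denom), np.log(1.0 / denom))
    path = np.zeros(len(p))
    current, signal = 0.0, None
    for t, wt in enumerate(w):
        current = max(0.0, current + wt)   # reflect at zero
        path[t] = current
        if signal is None and current > h:
            signal = t
    return path, signal

# Toy data (made up for illustration).
p = [0.10, 0.20, 0.05]
y = [0, 1, 0]
print(cumulative_expected_minus_observed(p, y))
print(risk_adjusted_cusum(p, y))
```

The first function has no built-in signalling rule, which is the limitation the abstract points out; the second signals when the path crosses `h`.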



Title: Pricing and Hedging of Barrier Options under Transaction Costs

Speaker: Miss Lim Pei Ling, Department of Statistics and Applied Probability, National University of Singapore

Date: 07 July 2008 (Monday)

Time: 3:00pm - 4:00pm

Venue:

Abstract

A barrier option is one of the most popular exotic options for structured products. Barrier options can be divided into two categories: knock-out and knock-in options. The knock-out (resp. knock-in) option expires (resp. becomes exercisable) automatically when the underlying stock price hits the predetermined barrier level. The problems of pricing and hedging barrier options in the presence of proportional transaction costs can be formulated as singular stochastic control problems. Thus far, the optimal hedging strategies have been computed numerically using the Markov chain approximation and discrete-time dynamic programming. However, this approach is computationally intensive. Lai and Lim (2006) proposed a new approach and an efficient backward algorithm to solve the problem of option pricing and hedging by using the equivalence of optimal stopping to the class of singular stochastic control problems. In this paper, we apply this newly proposed approach, in the case of negative exponential utility, to study the hedging strategies for the "up-and-out" barrier option in the presence of transaction costs. The technique results in an optimal hedging strategy that involves two boundaries, an optimal buy boundary and an optimal sell boundary. Numerical results are also studied in this paper.



Title: Regression Spline via Penalizing Derivatives

Speaker: Miss Zhu Yeying, Department of Statistics and Applied Probability, National University of Singapore

Date: 04 July 2008 (Friday)

Time: 3:00pm - 4:00pm

Venue:

Abstract

Regression splines based on a truncated power basis have proved to be a very useful nonparametric method. One way to implement this method is to approximate the unknown underlying function as a linear combination of the truncated power basis and estimate the coefficient vector appropriately. In situations where the coefficient vector is high-dimensional and sparse, the SCAD method can be used to select and estimate the non-zero components simultaneously. In other cases, when the coefficient vector is not sparse but the pth derivatives of the regression spline function are sparse, directly applying the SCAD method is less effective. In this thesis, we attempt to re-parameterize the coefficient vector as a linear function of a certain derivative vector, whose last K + 1 components are the pth derivatives of the regression spline function. We then apply the SCAD method to estimate the new coefficient vector. Numerical results show that the newly proposed method is much more accurate than the usual regression spline methods, especially when the true curve is piecewise polynomial with different orders on different segments.



Title: Pattern Theorem on Hexagonal Lattice

Speaker: Mrs Pritha Guha, Department of Statistics and Applied Probability, National University of Singapore

Date: 16 June 2008 (Monday)

Time: 3:00pm - 4:00pm

Venue:

Abstract

Please click here for abstract.



Title: Insights into the Mammal Radiation from Weird Australian Mammals

Speaker: Dr Gavin Huttley, Australian National University, Australia

Date: 21 May 2008 (Wednesday)

Time: 4:00pm - 5:00pm

Venue:

Abstract

The recent publication of the Platypus genome sequence 'completes' the sampling of all major mammal taxonomic divisions, with genome sequences now available for a monotreme, a marsupial and multiple eutherian lineages. The availability of the marsupial and monotreme lineages, combined with an additional bird outgroup, provides the essential references from which to infer the molecular events responsible for the emergence of mammals. They also allow examination of the mode and tempo of evolutionary divergence among eutherian lineages. So can we explain the molecular basis of uniquely mammalian traits such as lactation and mammogenesis; X chromosome inactivation; and homeothermy? Can we identify the genes at which molecular changes arose that underpin these characteristic phenotypes? Can we even resolve the relatively straightforward question of the evolutionary relationships among eutherian lineages? I will illustrate how one of the simpler genomic properties that differs between the sampled mammal genomes, genomic nucleotide composition, is confounding efforts to estimate relationships, rates of evolution and adaptive evolution.



Title: Estimating Population Size from Multiple Lists

Speaker: Dr Mao Changxuan, Department of Statistics, University of California, Riverside

Date: 28 April 2008 (Monday)

Time: 3:00pm - 4:00pm

Venue:

Abstract

The Rasch model is adopted to estimate the unknown population size in multi-list surveillance studies (disease, drug abuse, etc.). It takes both the list effectiveness and case heterogeneity into account. A stepwise approach is proposed in which optimization problems are solved conveniently.

The sharpest lower bound to the odds that a case is unseen is introduced, which can be calculated by linear programming. There are also some less sharp lower bounds. Estimating a lower bound leads to an estimator for the population size. Real examples are investigated.



Title: Expected number of real zeros of a random polynomial with independent, identically distributed, symmetric, long-tailed coefficients

Speaker: Prof Larry Shepp, Department of Statistics, Rutgers University

Date: 24 April 2008 (Thursday)

Time: 11:00am - 12:00pm

Venue:

Abstract

Please click here for abstract.



Title: Systems Bioinformatic Approaches for Characterizing, Engineering and Designing Complex Biological Systems

Speaker: Dr Lee Dong-Yup, Department of Chemical and Biomolecular Engineering, National University of Singapore

Date: 23 April 2008 (Wednesday)

Time: 4:00pm - 5:00pm

Venue:

Abstract

Recent advances in high-throughput experimental techniques now allow us to study various omics data sets for the global understanding of complex biological systems. Alongside high-throughput experiments, it is also increasingly accepted that in silico modeling and simulation improve our capability to elucidate the functions and characteristics of complex cellular systems. Thus, it is highly desirable to establish a systems bioinformatic platform for integrating wet experiments, concomitant statistical data analysis and in silico modeling at the systems level. My research projects at NUS and BTI focus on the development of systemic, integrative and bioinformatic approaches and their applications to complex biological systems, in order to understand and characterize such systems and to effectively achieve desirable properties of the systems by resorting to modeling, control, optimization and data analysis techniques. This talk highlights several on-going projects including statistical analysis of various omics data. Future perspectives on systems bioinformatics and some technical challenges are also discussed.

Dong-Yup Lee is an assistant professor in the Dept. of Chemical and Biomolecular Engineering at the National University of Singapore (NUS), with a joint appointment at the Bioprocessing Technology Institute (BTI) of A*STAR. He received his PhD in Chemical and Biomolecular Engineering from KAIST. Prior to joining NUS and BTI in 2005, he was a senior researcher at the Bioinformatics Research Center at KAIST. His main research interests are in the application of systems methodologies to understanding and designing biological and biomedical systems on a global scale. Main research fields include Systems Biology/Biotechnology/Bioinformatics, Drug & Disease Modeling and Control, and Supply Chain Management. He has coauthored about 30 research articles on these and other topics.



Title: Probabilistic and Statistical Study of Markov Models using Regeneration Techniques

Speaker: Prof. Stephan Clemencon, Telecom Paristech - LTCI UMR No. 5141 Institut Telecom/CNRS & Metarisk - INRA

Date: 22 April 2008 (Tuesday)

Time: 3:00pm - 4:00pm

Venue:

Abstract

In this talk, we shall describe new concepts and results in the field of probabilistic and statistical analysis of Markov chains, discrete-time processes widely used in applications for modelling random phenomena with a causality. The description of the behavior of the chain in terms of renewal processes is used here not only as a tool for proving theoretical results of a probabilistic nature (deviation inequalities, Edgeworth expansions, etc.) but also as a practical way of developing statistical procedures, in order to tackle a wide variety of statistical problems such as confidence interval construction, bootstrap, robust inference or extreme value statistics.



Title: Monotone Penalised Spline Smoothing

Speaker: A/Prof Turlach Berwin Ashoka, Department of Statistics and Applied Probability, National University of Singapore

Date: 16 April 2008 (Wednesday)

Time: 3:00pm - 4:00pm

Venue:

Abstract

Penalised spline smoothing (Eilers and Marx, 1996; Ruppert and Carroll, 2000) is, arguably, fast becoming the method of choice for non- and semiparametric regression models. The attractiveness of penalised spline smoothers is twofold. First, compared with other smoothing methods, e.g. smoothing splines or kernel smoothers, fitting penalised spline smoothers is computationally less complex.

Secondly, the connection between smoothing methods and mixed models (Speed, 1991) is particularly easy to establish for penalised spline smoothers. Thus, it is easy to incorporate a penalised spline smoother into a semiparametric regression model and fit the model using standard software available for fitting (linear) mixed models (Ruppert, Wand and Carroll, 2003).

However, in some situations, one would like to combine the flexibility of nonparametric smoothing techniques with prior knowledge in the form of constraints on the response curve given by, say, a physical or economic theory. In this talk, we discuss how monotonicity constraints can be imposed on penalised spline smoothers.



Title: Counting Without Sampling: Asymptotics of the Log-Partition Function

Speaker: Professor Antar Bandyopadhyay, Theoretical Statistics and Mathematics Unit, Indian Statistical Institute, New Delhi Centre, New Delhi, India

Date: 02 April 2008 (Wednesday)

Time: 4:00pm - 5:00pm

Venue:

Abstract

In this talk we will propose new methods for computing the asymptotic value of the logarithm of the partition function for certain statistical physics models on certain types of finite graphs, as the size of the underlying graph goes to infinity. We will consider two models, namely the hard-core model when the activity parameter λ is small, and the model for counting the number of proper q-colorings. We will only consider graphs with large girth. In particular, we will show that asymptotically the logarithm of the number of independent sets of any r-regular graph with large girth is constant, when r ≤ 5. For example, we will show that every 4-regular n-node graph with large girth has approximately (1.494...)^n-many independent sets, for large n. Similarly we will prove that for every r-regular graph with r ≥ 2, with n nodes and large girth, the number of proper q-colorings with q ≥ r + 1 is approximately a constant (which can be explicitly written in terms of q and r), when n is large. Similar results also hold for random regular graphs.

As a byproduct of our method we will show that one can obtain some simple approximate counting algorithms for the problem of enumerating the number of independent sets, and proper colorings, in low degree graphs with large girth. These algorithms will be deterministic as opposed to Markov chain sampling schemes which are typically used in this context.

Our main approach will be to use a (strong) correlation decay property for the corresponding Gibbs measure (at certain parameter regime), along with a simple cavity trick which is well known in the physics literature.

(This is joint work with David Gamarnik, Sloan School of Management, MIT).


Title: Semiparametric Regression And The Computer Science Interface

Speaker: Professor Matthew Wand, University of Wollongong, Australia

Date: 26 March 2008 (Wednesday)

Time: 4:00pm - 5:00pm

Venue:

Abstract

Semiparametric regression is concerned with flexible incorporation of nonlinear functional relationships in regression analyses, and is also the title of a 2003 book co-authored by the speaker. Examples of semiparametric regression include generalised additive models and additive mixed models for longitudinal data. The field has evolved almost entirely within the field of Statistics. In this talk we discuss semiparametric regression in light of the dissolving frontier between statistics and Computer Science. In particular we will discuss ways by which semiparametric regression can benefit from, and be beneficial to, Computer Science research.


Title: Statistical Estimation for Informatively Censored Survival Data

Speaker: Professor Zhang Wenyang, Department of Mathematical Sciences, University of Bath, UK

Date: 19 March 2008 (Wednesday)

Time: 4:00pm - 5:00pm

Venue:

Abstract

Partial likelihood estimation is a commonly used way to deal with censored data. The vital assumption for partial likelihood estimation is that the censoring is noninformative. However, sometimes the censoring is indeed informative. One pays a price in the efficiency of the resulting estimator if partial likelihood estimation is still used when the censoring is informative. In this talk, I will take a complete likelihood estimation approach, and appeal to local polynomial modelling to deal with informatively censored survival data. Simulation studies show that the complete likelihood estimation approach indeed improves the efficiency of the estimator. Traditionally, in survival analysis, when the complete likelihood estimation approach is taken, the baseline function is usually modelled by the least informative approach, see Fan and Gijbels (1996). While this approach is very appealing for estimating the coefficients, it does not seem to work very well for estimating the baseline function itself.

The approach I take in this talk to deal with the baseline function is quite different from the traditional one, though based on local constant modelling. I will show that the baseline function can be estimated accurately by the proposed estimation method. I will also show that direct local linear modelling would not work; the local constant modelling has to be conducted in an indirect way. Finally, I will use the proposed methods to analyse the second birth interval in Bangladesh, which leads to some interesting findings.



Title: Dealing with Spreadsheet Addiction

Speaker: Professor J. C. Nash, School of Management, University of Ottawa

Date: 05 March 2008 (Wednesday)

Time: 3:00pm - 4:00pm

Venue:

Abstract

Many organizations and managers suffer a quiet addiction to spreadsheets. First turned on through easy availability, they typically get drawn into overuse through the attraction of cells that can be whatever they want them to be, designer macros, and charts in psychedelic colours. Yet all of these attractions hide the propensity of these uncontrolled programming environments to accidentally lose VERY large amounts of money, and to make it difficult or impossible to detect misreporting. Empirical studies demonstrate that the proportion of spreadsheets without serious errors is 0% (Yes ZERO %).

It seems unlikely we can wrest the quick fix of spreadsheets from the grip of determined users. What, then, can we do to minimize the harm that spreadsheet addicts do to themselves and to their employers?

In particular, we will consider how to ensure: enforceable audit trails; better function management, and better functions; more rigorous testing methods; and platform independence. (These items will be expanded for a mathematical audience.)

The speaker will present ideas arising from his involvement with two ongoing projects:

1) to provide tests of spreadsheet functions; and,

2) to offer audit trail and collaboration capability for spreadsheets and other office-suite software.

He will also touch on a few of the many ideas and projects that have been presented at the European Spreadsheet Risks Interest Group conferences. Despite its name, EuSpRIG has world-wide participation.



Title: Parameter Estimation and Bias Correction for Diffusion Processes

Speaker: Dr Tang Cheng Yong, Department of Statistics, Iowa State University

Date: 03 March 2008 (Monday)

Time: 4:00pm - 5:00pm

Venue:

Abstract

This paper considers parameter estimation for continuous-time diffusion processes, which are commonly used to model the dynamics of financial securities, including interest rates. To understand why the drift parameters are more difficult to estimate than the diffusion parameter, as observed in many empirical studies, we develop expansions for the bias and variance of parameter estimators for two of the most commonly employed interest rate processes. A parametric bootstrap procedure, with a theoretical justification, is proposed to correct bias in parameter estimation for general diffusion processes. Simulation studies confirm the theoretical findings and show that the bootstrap proposal can effectively reduce both the bias and the mean square error of parameter estimates for both univariate and multivariate processes. The advantages of using more accurate parameter estimators when calculating various option prices in finance are demonstrated by an empirical study of Fed funds rate data.
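The parametric bootstrap idea in this abstract can be illustrated with a minimal sketch. It is not the speaker's procedure: it assumes an Ornstein-Uhlenbeck (Vasicek-type) process dX = kappa*(mu - X) dt + sigma dW, fits it by least squares on its exact AR(1) transition, and corrects only the mean-reversion estimate. All function names, parameter values and the clipping guard are illustrative assumptions.

```python
import math
import random

def simulate_ou(kappa, mu, sigma, x0, n, dt, rng):
    """Simulate n steps of dX = kappa*(mu - X) dt + sigma dW via its exact AR(1) transition."""
    b = math.exp(-kappa * dt)
    sd = sigma * math.sqrt((1.0 - b * b) / (2.0 * kappa))
    path = [x0]
    for _ in range(n):
        path.append(mu + (path[-1] - mu) * b + sd * rng.gauss(0.0, 1.0))
    return path

def estimate_ou(path, dt):
    """Least-squares fit of X[t+1] = a + b*X[t] + e, mapped back to (kappa, mu, sigma)."""
    x, y = path[:-1], path[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    b = min(max(b, 1e-4), 1.0 - 1e-4)  # guard (assumption): keep the fit mean-reverting
    a = my - b * mx
    resid_var = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / n
    kappa = -math.log(b) / dt
    mu = a / (1.0 - b)
    sigma = math.sqrt(resid_var * 2.0 * kappa / (1.0 - b * b))
    return kappa, mu, sigma

def bootstrap_bias_correct(path, dt, n_boot, rng):
    """Return (raw kappa estimate, bootstrap bias-corrected kappa estimate)."""
    kappa_hat, mu_hat, sigma_hat = estimate_ou(path, dt)
    n = len(path) - 1
    boot = []
    for _ in range(n_boot):
        # Simulate from the fitted model and re-estimate on each synthetic path.
        sim = simulate_ou(kappa_hat, mu_hat, sigma_hat, path[0], n, dt, rng)
        boot.append(estimate_ou(sim, dt)[0])
    bias = sum(boot) / n_boot - kappa_hat  # bootstrap estimate of the bias
    return kappa_hat, kappa_hat - bias

# Illustrative values only: monthly observations of a slowly mean-reverting rate.
rng = random.Random(0)
dt = 1.0 / 12.0
observed = simulate_ou(0.5, 0.05, 0.02, 0.05, 500, dt, rng)
kappa_hat, kappa_bc = bootstrap_bias_correct(observed, dt, 200, rng)
print("raw kappa estimate:      ", round(kappa_hat, 4))
print("bias-corrected estimate: ", round(kappa_bc, 4))
```

The correction subtracts the average estimation error seen on paths simulated from the fitted model. The same pattern would apply to other parameters or other diffusion models, provided they can be simulated and re-estimated.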



Title: Reconstructing the Effect of Alternative Intervention Strategies on Historic Epidemics

Speaker: Dr Alex R. Cook, Department of Plant Sciences, University of Cambridge, England, UK

Date: 20 February 2008 (Wednesday)

Time: 4:00pm - 5:00pm

Venue:

Abstract

Data from historical epidemics provide a vital and sometimes under-used resource from which to devise strategies for future control of disease. Previous methods for retrospective analysis of epidemics, in which alternative interventions are compared, do not make full use of the information; by using only partial information on the historical trajectory, augmentation of control may lead to predictions of a paradoxical increase in disease. Here we introduce a novel statistical approach that takes full account of the available information in constructing the effect of alternative intervention strategies in historic epidemics. The key to the method lies in identifying a suitable mapping between the historic and notional outbreaks, under alternative control strategies. This is done by using the Sellke construction as a latent process linking epidemics. The application of the method is illustrated by two examples. First, using temporal data for the common human cold, the improvement under the new method in the precision of predictions for different control strategies is shown. Secondly, the generality of the method for retrospective analysis of epidemics is shown by applying it to a spatially-extended arboreal epidemic in which the relative effectiveness of host culling strategies that differ in frequency and spatial extent is compared. Some of the inferential and philosophical issues that arise are discussed, along with the scope of potential application of the new method.
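A rough sketch of how a Sellke-type construction can link a historic and a notional outbreak is given below. It is not the speaker's method: it assumes a homogeneously mixing SIR model and compares only final sizes. Each susceptible receives an Exp(1) resistance threshold and becomes infected once the accumulated infection pressure reaches it. Reusing the same thresholds and infectious periods under a lower transmission rate gives the coupled counterfactual. All names and parameter values are hypothetical.

```python
import random

def draw_latent(n_susceptible, n_initial, mean_infectious, rng):
    """Draw the latent quantities that are shared by every scenario."""
    # Resistance thresholds, sorted so individuals are infected in threshold order.
    thresholds = sorted(rng.expovariate(1.0) for _ in range(n_susceptible))
    # Infectious periods, assigned in order of infection (initial cases first).
    periods = [rng.expovariate(1.0 / mean_infectious)
               for _ in range(n_initial + n_susceptible)]
    return thresholds, periods

def final_size(beta, thresholds, periods, n_initial):
    """Number of initially susceptible individuals who are ever infected."""
    population = len(periods)
    # Total infection pressure generated by the initial infectives.
    pressure = beta / population * sum(periods[:n_initial])
    infected = 0
    while infected < len(thresholds) and thresholds[infected] <= pressure:
        # The next individual succumbs and adds its own infectious period.
        pressure += beta / population * periods[n_initial + infected]
        infected += 1
    return infected

rng = random.Random(42)
thresholds, periods = draw_latent(200, 1, 1.0, rng)

historic = final_size(2.0, thresholds, periods, 1)  # transmission rate as observed
notional = final_size(1.2, thresholds, periods, 1)  # same latent draws, stronger control
print("historic final size:", historic)
print("notional final size:", notional)
```

Because both scenarios share the same latent draws, reducing the transmission rate can never increase the final size in this sketch. This mirrors the abstract's point that a suitable mapping between outbreaks avoids paradoxical predictions of more disease under more control.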



Title: Bayesian Hierarchical Modeling for Extreme Values Observed Over Space and Time - Cancelled

Speaker: Dr Sang Huiyan, Department of Statistical Sciences, Duke University, Durham, NC

Date: 13 February 2008 (Wednesday)

Time: 2:00pm - 3:00pm

Venue:

Abstract

In this talk, I will begin with extreme value theory and a discussion on issues in modeling multivariate extremes. I will then present our hierarchical modeling approach for explaining a collection of spatially-referenced time series of extreme values. The univariate distributions of extreme values are extended to higher dimensions using latent multivariate Markov random field models specified through coregionalization, which allows the interpretation of high dimensional extreme value analysis including the nature of spatial association and the nature of temporal trend. By relaxing the assumption of conditional independence in the hierarchical models, we extend our approach to describe extreme values with a smoothed spatial process, which can be used in spatial interpolation with extremes.



Title: Statistical Issues that Arise in Modeling and Regulating Air Pollution Fields - Joint Seminar With Institute for Mathematical Sciences (IMS)

Speaker: Professor Jim Zidek, Department of Statistics, University of British Columbia

Date: 23 January 2008 (Wednesday)

Time: 4:00pm - 5:00pm

Venue:

Abstract

The earth's atmosphere is a complex stochastic system which includes amongst other things pollution fields, a part of each deriving from anthropogenic sources and activities. Because of their negative health impacts, these fields are now subject to regulation.

However, setting the air quality standards needed to regulate them is itself a complex business, and that leads to a need for good models for these fields. This talk, drawing on the speaker's recent experience and research connected with ozone, will describe physical, computational and statistical approaches to modeling pollution fields and how these might be combined. Finally, he will describe some of the ways in which the results of these models play into the process of developing standards. Although the talk focuses on random pollution fields, the modeling issues it raises have become quite pervasive in current research in statistical science.


Title: Structural Models of Corporate Bond Pricing with Maximum Likelihood Estimation - Joint Seminar With Risk Management Institute (RMI)

Speaker: Dr Hoi Ying Wong, The Chinese University of Hong Kong

Date: 16 January 2008 (Wednesday)

Time: 3:00pm - 4:00pm

Venue:

Abstract

Testing structural models of corporate bond pricing is equivalent to examining the performance of credit risk models in finance. This study empirically examines the proxy, volatility-restriction (VR) and maximum likelihood (ML) approaches to implementing structural corporate bond pricing models, and documents that ML estimation is the best among the three implementation methods. Empirical studies using either the proxy approach or the VR method conclude that barrier-independent models significantly underestimate corporate bond yields. Although barrier-dependent models tend to overestimate the yield on average, they generate a sizable degree of underestimation. The present work shows that the proxy approach is an upwardly biased estimator of the corporate assets and makes the empirical framework work systematically against structural models of corporate bond pricing. The VR approach may generate inconsistent corporate bond prices or may fail to give a positive corporate bond price for some structural models. When the Merton, LS, BD and LT models are implemented with ML estimation, we find substantial improvement in their performance. Our empirical analysis shows that the LT model is very accurate for predicting short-term bond yields, whereas the LS and BD models are good predictors for medium-term and long-term bonds. The Merton model, however, significantly overestimates short-term bond yields and underestimates long-term bond yields. Unlike in past empirical studies, the Merton model implemented with ML estimation does not consistently underestimate corporate bond yields. This research gives an example in favor of using statistics in empirical finance rather than using a simplifying accounting rule, and spells out the potential proxy risk in empirical studies.


Title: Statistical Inference for GARCH type Models

Speaker: Dr Chi Tim Ng, Timothy, Department of Statistics, College of Natural Sciences, Seoul National University, Seoul, South Korea

Date: 08 January 2008 (Tuesday)

Time: 4:00pm - 5:00pm

Venue: S16-06-118 (Seminar Room)

Abstract

Since Engle's work, ARCH models have received considerable attention among economists, and various types of generalizations of the ARCH models have been proposed. Among these models, those incorporating the notions of fractional differencing and non-stationarity are the most interesting ones, as they offer many challenging theoretical problems.

One commonly used technique to estimate the parameters in ARCH-type models is quasi-maximum likelihood estimation (QMLE). To establish the asymptotic properties of the QMLE, one usually has to impose stringent assumptions; see Robinson and Zaffaroni (2006) and Straumann (2005). They have to assume that a stationary solution to the true model exists and that this solution has some finite moments. These two assumptions are too restrictive to be applied to non-stationary GARCH models exhibiting explosive behavior. Also, there are still controversies over the stationarity of certain fractional-differencing models.

In this talk, I will give a brief review of the well-established results for the stationary GARCH model and present new results for two generalized ARCH-type models, namely the non-stationary GARCH model (see Jensen and Rahbek, 2004) and the fractionally-integrated GARCH model (see Baillie et al., 1996). The regularity conditions under which the strong consistency and asymptotic normality of the QMLE of the fractionally-integrated GARCH model hold are given in this presentation. In addition, the results for non-stationary GARCH($1,1$) models in Jensen and Rahbek (2004) will be extended to general non-stationary GARCH($p,q$) models.
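As a minimal illustration of the quasi-maximum likelihood idea mentioned in this abstract, the sketch below evaluates the Gaussian quasi-likelihood of a stationary GARCH(1,1) model and minimises it over a coarse grid. It does not cover the non-stationary or fractionally-integrated models of the talk. The grid search stands in for a proper numerical optimiser, and the function names, grid values and variance initialisation are assumptions made for illustration.

```python
import math
import random

def simulate_garch11(omega, alpha, beta, n, rng):
    """Simulate e_t = sigma_t * z_t with sigma_t^2 = omega + alpha*e_{t-1}^2 + beta*sigma_{t-1}^2."""
    var = omega / (1.0 - alpha - beta)  # stationary variance (requires alpha + beta < 1)
    returns = []
    for _ in range(n):
        e = math.sqrt(var) * rng.gauss(0.0, 1.0)
        returns.append(e)
        var = omega + alpha * e * e + beta * var
    return returns

def neg_quasi_loglik(omega, alpha, beta, returns):
    """Gaussian negative quasi-log-likelihood, with the additive constant dropped."""
    # Assumption: initialise the variance recursion at the sample second moment.
    var = sum(e * e for e in returns) / len(returns)
    total = 0.0
    for e in returns:
        total += 0.5 * (math.log(var) + e * e / var)
        var = omega + alpha * e * e + beta * var
    return total

def grid_qmle(returns, omegas, alphas, betas):
    """Return (criterion value, omega, alpha, beta) at the best grid point."""
    best = None
    for w in omegas:
        for a in alphas:
            for b in betas:
                value = neg_quasi_loglik(w, a, b, returns)
                if best is None or value < best[0]:
                    best = (value, w, a, b)
    return best

rng = random.Random(7)
returns = simulate_garch11(0.1, 0.1, 0.8, 2000, rng)
best = grid_qmle(returns,
                 omegas=[0.05, 0.1, 0.2],
                 alphas=[0.05, 0.1, 0.2],
                 betas=[0.6, 0.7, 0.8, 0.85])
print("grid QMLE (omega, alpha, beta):", best[1:])
```

The criterion is called a quasi-likelihood because the Gaussian form is used for estimation even when the innovations are not assumed to be Gaussian.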



© Copyright 2001-04 National University of Singapore. All Rights Reserved.