## Virtual Seminar Series - Research and teaching in statistical and data sciences

This is the webpage for the Research and Teaching in Statistical and Data Sciences seminar series.

This diverse seminar series will highlight novel advances in methodology and application in statistics and data science, and will take the place of the University of Glasgow Statistics Group seminar during this period of remote working. We welcome all interested attendees at Glasgow and further afield.

Call details will be sent out 30 minutes before the start of each seminar.

The dates of the seminars and speakers are as follows:

### PLEASE NOTE THIS SEMINAR HAS BEEN POSTPONED

28 May 2020, 1pm (BST)

Nicole Augustin (University of Edinburgh)

Title: Introduction of standardised tobacco packaging and minimum excise tax in the UK: a prospective study

Abstract: Standardised packaging for factory-made and roll-your-own tobacco was implemented in the UK in May 2017, alongside a minimum excise tax for factory-made products. As other jurisdictions attempt to implement standardised packaging, the tobacco industry continues to suggest that it would be counterproductive, in part by leading to falls in price due to commoditisation. Here, we assess the impact of the introduction of these policies on the UK tobacco market. We carried out a prospective study of UK commercial electronic point-of-sale data from 11 constituent geographic areas. Data were available for each tobacco product (or Stock Keeping Unit (SKU)): the tobacco brand, brand family, brand variant, and specific features of the pack. For each SKU, three years (May 2015 to April 2018) of monthly data on volume of sales, sales prices, and extent of distribution of sales within the 11 UK geographical areas were available. The main outcomes were changes in sales volumes, volume-weighted real prices, and tobacco industry revenue. To estimate temporal trends of monthly price per stick, revenue and volume sold, we used additive mixed models. In the talk we will cover some of the statistically interesting details on data preparation, model choice, trend estimation and presentation of model results. We will also present the main results and talk about limitations. This is joint work with Rosemary Hiscock, Rob Branston and Anna Gilmore, all at the University of Bath. The project was funded by Cancer Research UK.

### Future Seminars

18 June 2020, 2pm (BST)

Jo Eidsvik (NTNU)

Title: Autonomous Oceanographic Sampling Designs Using Excursion Sets for Multivariate Gaussian Random Fields

Abstract: Improving and optimizing oceanographic sampling is a crucial task for marine science and maritime management. Faced with limited resources to understand processes in the water-column, the combination of statistics and autonomous robotics provides new opportunities for experimental designs. In this work we develop methods for efficient spatial sampling applied to the mapping of coastal processes by providing informative descriptions of spatial characteristics of ocean phenomena. Specifically, we define a design criterion based on improved characterization of the uncertainty in the excursions of vector-valued Gaussian random fields, and derive tractable expressions for the expected Bernoulli variance reduction in such a framework. We demonstrate how this criterion can be used to prioritize sampling efforts at locations that are ambiguous, making exploration more effective. We use simulations to study the properties of methods and to compare them with state-of-the-art approaches, followed by results from field deployments with an autonomous underwater vehicle as part of a case study mapping the boundary of a river plume. The results demonstrate the potential of combining statistical methods and robotic platforms to effectively inform and execute data-driven environmental sampling.

9 July 2020, 2pm (BST)

Vianey Leos-Barajas (NCSU)

Title: TBC

Abstract: TBC

Date: TBC, Time: TBC

Paul van Dam-Bates (University of St Andrews)

Title: Applications of the Halton Sequence for Spatially Balanced Sampling of Natural Resources

Abstract: We will demonstrate by example how some of the properties of the quasi-random Halton sequence can be used for flexible spatially balanced sampling of many different resource types. By introducing Halton boxes we expand Balanced Acceptance Sampling (BAS) for problems such as local sample replacement, double sampling and incorporating legacy sites. We will then show how Halton Iterative Partitioning can be used to select a Halton grid-based sample to create a freshwater master sample that coordinates monitoring at multiple spatial scales. Examples include marine monitoring in Western Canada, terrestrial and stream monitoring in New Zealand and lake monitoring in the Northwest Territories.

### Previous Seminars

23rd April 2020, 10am

Neil Chada (National University of Singapore)

Title: Advancements of non-Gaussian random fields for statistical inversion

Abstract: Developing informative priors for Bayesian inverse problems is an important direction, which can help quantify information on the posterior. In this talk we introduce a new class of priors for inversion based on $\alpha$-stable sheets, which incorporate multiple known processes such as Gaussian and Cauchy processes. We analyze various convergence properties, which are obtained through the different representations these sheets can take. Other aspects we wish to address are well-posedness of the inverse problem and finite-dimensional approximations. To complement the analysis we provide some connections with machine learning, which will allow us to use sampling-based MCMC schemes. We will conclude the talk with some numerical experiments, highlighting the robustness of the established connection, on various inverse problems arising in regression and PDEs.

14th May 2020, 2pm

Title: Consensus clustering based on pivotal methods

Abstract: Despite its widespread use, one major limitation of the K-means clustering algorithm is its sensitivity to the initial seeding used to produce the final partition. We propose a modified version of the classical approach, which exploits the information contained in a co-association matrix obtained from clustering ensembles. Our proposal is based on the identification of a set of data points (pivotal units) that are representative of the group they belong to. The presented approach can thus be viewed as a possible strategy to perform consensus clustering. The selection of pivotal units was originally employed for solving the so-called label-switching problem in Bayesian estimation of finite mixture models. Different criteria for identifying the pivots are discussed and compared. We investigate the performance of the proposed algorithm via simulation experiments and comparison with other consensus methods available in the literature.

21st May 2020, 2pm

Ana Basiri (UCL)

Title: Who Are the "Crowd"? Learning from Large but Patchy Samples

Abstract: This talk will look at the challenges of crowdsourced/self-reported data, such as missingness and biases in ‘new forms of data’, and consider them as a useful source of data in themselves. A few applications and examples of these will be discussed, including extracting the 3D map of cities using the patterns of blockage, reflection, and attenuation of GPS signals (or other similar signals) that are contributed by the volunteers/crowd. In the era of big data, open data, social media and crowdsourced data, when “we are drowning in data”, the gaps, unavailability, representativeness and bias issues associated with these data may indicate hidden problems or reasons, allowing us to understand the data, society and cities better.

This seminar series is supported as part of the ICMS/INI Online Mathematical Sciences Seminars.