Theses and Dissertations (Statistics)
Permanent URI for this collection: http://hdl.handle.net/2263/32483
Recent Submissions
Now showing 1 - 20 of 133
Item: Explainable Bayesian networks : taxonomy, properties and approximation methods (University of Pretoria, 2024-07-22)
De Waal, Alta; inekederks1@gmail.com; Derks, Iena Petronella

Technological advances have integrated artificial intelligence (AI) into various scientific fields, necessitating an understanding of AI-derived decisions. The field of explainable artificial intelligence (XAI) has emerged to address transparency concerns, offering both transparent models and post-hoc explanation techniques. Recent research emphasises the importance of developing transparent models, with a focus on enhancing their interpretability. An example of a transparent model that would benefit from enhanced post-hoc explainability is the Bayesian network. This research investigates the current state of explainability in Bayesian networks. The literature identifies three categories of explanation: explanation of the model, of the reasoning, and of the evidence. Drawing upon these categories, we formulate a taxonomy of explainable Bayesian networks. Following this, we extend the taxonomy to include explanation of decisions, an area recognised as neglected within the broader XAI research field. This includes using the same-decision probability, a threshold-based confidence measure, as a stopping and selection criterion for decision-making. Additionally, acknowledging computational efficiency as a concern in XAI, we introduce an approximate forward-gLasso algorithm for efficiently solving the most relevant explanation. We compare the proposed algorithm with a local, exhaustive forward search. The forward-gLasso algorithm demonstrates accuracy comparable to the forward search while reducing the average neighbourhood size, leading to computationally efficient explanations. All coding was done in R, building on existing packages for Bayesian networks. As a result, we develop an open-source R package capable of generating explanations of evidence for Bayesian networks.
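To make the notion of explaining evidence concrete: in a Bayesian network, observed evidence updates beliefs through Bayes' rule. A minimal sketch on a hypothetical two-node network, with illustrative numbers that are not taken from the thesis or its R package:

```python
# Hypothetical two-node Bayesian network Disease -> Test; all numbers are
# illustrative. The evidence "Test = positive" updates the belief in Disease.
p_d = 0.01                 # prior P(Disease)
p_pos_given_d = 0.95       # P(positive | Disease)
p_pos_given_not_d = 0.05   # P(positive | no Disease)

# Marginal probability of the evidence, then Bayes' rule.
p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)
posterior = p_pos_given_d * p_d / p_pos
print(round(posterior, 3))  # the evidence raises P(Disease) well above the prior
```

An explanation of evidence asks which observed variables drove such a posterior shift; the same-decision probability mentioned above quantifies whether gathering further evidence could still change the resulting decision.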
Lastly, we demonstrate the practical insights gained from applying post-hoc explanations to real-world data, such as the South African Victims of Crime Survey 2016 - 2017.

Item: Hypersphere candidates emanating from the Dirichlet and its extension (University of Pretoria, 2024-07)
Makgai, Seitebaleng; Bekker, Andriette, 1958-; u18243020@tuks.co.za; Leshilo, Ramadimetje Lethabo

Compositional datasets consist of observations that are proportional and are subject to non-negativity and unit-sum constraints. These datasets arise naturally in a multiplicity of fields such as agriculture, archaeology, economics, geology, health sciences, and psychology. There is a strong footprint in the literature on the Dirichlet distribution for modelling compositional datasets, followed by several generalisations of the Dirichlet distribution with more flexible structures. In this study, we consider a transformation of Dirichlet-type random variables W1, W2, ..., Wm by applying the square-root transformation Xi = √Wi for i = 1, 2, ..., m. With this square-root transformation, we propose and develop a new distribution that is defined on the positive orthant of the hypersphere and accommodates both positive and negative covariance structures. This novel model is a flexible addition to the spherical-Dirichlet family of models. We perform several simulation studies for the proposed model. Maximum likelihood is used for parameter estimation. Two applications of the model, to biological and archaeological compositional datasets, are presented to illustrate its flexibility.

Item: Essays on estimation strategies addressing label-switching in Gaussian mixtures of semi- and non-parametric regressions (University of Pretoria, 2024-04-30)
Millard, Sollie M.; Kanfer, F.H.J. (Frans); spiwe.skhosana@up.ac.za; Skhosana, Sphiwe Bonakele

Gaussian mixtures of non-parametric regressions (GMNRs) are a flexible class of Gaussian mixtures of regressions (GMRs).
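The square-root mapping in the hypersphere item above can be sketched numerically: because Dirichlet components sum to one, their square roots satisfy sum(Xi²) = 1 and land exactly on the positive orthant of the unit sphere. The concentration parameters below are arbitrary illustrative choices:

```python
# Square-root transformation of Dirichlet samples onto the unit hypersphere.
import numpy as np

rng = np.random.default_rng(0)
W = rng.dirichlet([2.0, 3.0, 5.0], size=1000)  # rows sum to one
X = np.sqrt(W)                                 # X_i = sqrt(W_i)

norms = np.linalg.norm(X, axis=1)              # sum of X_i^2 equals sum of W_i
print(bool(np.allclose(norms, 1.0)))           # prints True: all points have unit norm
```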
These models assume that some or all of the parameters of GMRs are non-parametric functions of the covariates. This flexibility gives these models wide applicability for studying the dependence of one variable on one or more covariates when the underlying population is made up of unobserved subpopulations. The predominant approach to estimating the GMR model is maximum likelihood via the Expectation-Maximisation (EM) algorithm. Due to the presence of non-parametric terms in GMNRs, model estimation poses a computational challenge. A local-likelihood estimation of the non-parametric functions via the EM algorithm may be subject to label-switching. To estimate the non-parametric functions, we have to define a local-likelihood function for each local grid point on the domain of a covariate. If we separately maximise each local-likelihood function using the EM algorithm, the labels attached to the mixture components may switch from one local grid point to the next. The practical consequence of this label-switching is non-parametric estimates that are non-smooth, exhibiting irregular behaviour at the local points where the switch took place. In this thesis, we propose effective estimation strategies to address label-switching. The common thread underlying the proposed strategies is the replacement of the separate maximisations of the local-likelihood functions with a simultaneous maximisation. The effectiveness of the proposed methods is demonstrated on finite-sample data using simulations. Furthermore, the practical usefulness of the proposed methods is demonstrated through applications to real data.

Item: Multiscale decomposition of spatial lattice data for hotspot prediction (University of Pretoria, 2023-11-27)
Fabris-Rotelli, Inger Nicolette; Chen, Ding-Geng; rene.stander@up.ac.za; Stander, René

Being able to identify areas at risk of becoming hotspots of disease cases is important for decision makers.
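Label-switching, as discussed in the essays item above, is possible because a mixture likelihood is invariant under permutation of its component labels. A minimal numerical check with arbitrary toy parameters, far simpler than the thesis's semi-parametric setting:

```python
# The two-component Gaussian mixture log-likelihood is identical when the
# component labels (weights and means together) are swapped, so the labels
# carry no information on their own and estimates can switch between fits.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2, 1, 50), rng.normal(2, 1, 50)])

def mix_loglik(x, w, mu, sd):
    dens = w[0] * norm.pdf(x, mu[0], sd[0]) + w[1] * norm.pdf(x, mu[1], sd[1])
    return float(np.log(dens).sum())

ll = mix_loglik(x, [0.4, 0.6], [-2.0, 2.0], [1.0, 1.0])
ll_swapped = mix_loglik(x, [0.6, 0.4], [2.0, -2.0], [1.0, 1.0])
print(bool(np.isclose(ll, ll_swapped)))  # prints True: the labelings are indistinguishable
```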
This is especially true in cases such as the recent COVID-19 pandemic, where prevention strategies were needed to restrain the spread of the disease. In this thesis, we first extend the Discrete Pulse Transform (DPT) theory to irregular lattice data and consider its efficient implementation, the Roadmaker's Pavage algorithm (RMPA), and its visualisation. The DPT was derived considering all possible connectivities satisfying the morphological definition of connection; our implementation allows for any connectivity applicable to regular and irregular lattices. Next, we make use of the DPT to decompose spatial lattice data, using the multiscale Ht-index and the spatial scan statistic as measures of saliency on the extracted pulses to detect significant hotspots. In the literature, geostatistical techniques such as kriging have been used in epidemiology to interpolate disease cases from areal data to a continuous surface. Herein, we extend the estimation of a variogram to spatial lattice data. To increase the number of data points beyond the centroids of each spatial unit (the representative points), multiple points are simulated in an appropriate way to represent the continuous nature of the true underlying event occurrences more closely. We thus represent spatial lattice data by a continuous spatial process in order to capture the spatial variability using a variogram. Lastly, we incorporate the geographically and temporally weighted regression spatio-temporal kriging (GTWR-STK) method to forecast COVID-19 cases one time step ahead. The GTWR-STK method is applied to spatial lattice data, where the spatio-temporal variogram is estimated by extending the proposed variogram for spatial lattice data.
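The variogram underpinning the lattice extension above can be illustrated with the textbook empirical estimator on a one-dimensional toy process; this is the classical estimator, not the thesis's extension to spatial lattice data:

```python
# Empirical semivariogram: gamma(h) = 0.5 * mean[(Z(s) - Z(s + h))^2]
# over all pairs of locations separated by lag h.
import numpy as np

rng = np.random.default_rng(2)
z = np.cumsum(rng.normal(size=200))        # spatially correlated toy field

def empirical_variogram(z, max_lag):
    return np.array([0.5 * np.mean((z[h:] - z[:-h]) ** 2)
                     for h in range(1, max_lag + 1)])

gamma = empirical_variogram(z, max_lag=20)
print(bool(gamma[0] < gamma[-1]))          # prints True: dissimilarity grows with distance
```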
Hotspots are predicted by applying the proposed hotspot detection method to the forecasted cases.

Item: Enhancing spatial image analysis : modelling perspectives on the usefulness of level-sets (University of Pretoria, 2024-03)
Fabris-Rotelli, Inger Nicolette; Loots, Mattheus Theodor; u15002536@tuks.co.za; Stander, Jean-Pierre

This thesis presents a comprehensive exploration of level-sets applied to various stages of image analysis, aiming to enhance the understanding, modelling, and interpretability of image data. The research focuses on three critical aspects, namely data cleaning, data modelling, and explainability. In data cleaning, the adaptive median filter is a commonly used technique for removing noise from images, comparing each pixel to an adaptive window around it. Herein the adaptive median filter is improved by acting on level-sets rather than individual pixels. The proposed level-sets adaptive median filter demonstrates effective noise removal while preserving edges in the images better than the traditional adaptive median filter. Secondly, this work considers representing images as graphical models, with the nodes corresponding to the fuzzy level-sets of the images. This novel representation successfully preserves and maps the critical image information required for understanding image context in a binary classification scenario. Further, this representation is used to propose a novel method for modelling images, which enables inference to be applied to image content directly. Finally, within the realm of deep-learning object-detection saliency maps, the detector randomised input sampling for explanation (D-RISE) method is extended using informative level-set sampling. A key, yet computationally expensive, component of the former is the generation of a suitable number of masks.
The proposed methodology in this work, namely the adaptive D-RISE, harnesses proportional level-sets sampling of masks to reduce the required number of masks and improve the convergence of attribution.

Item: New characterisations of spatial linear networks for geographical accessibility (University of Pretoria, 2024-02-13)
Fabris-Rotelli, Inger Nicolette; Debba, Pravesh; Cleghorn, Christopher W; renate.thiede@up.ac.za; Thiede, Renate Nicole

Target 9.1 of the United Nations Sustainable Development Goals specifies the need for affordable, equitable access for all. In South Africa, where most travel occurs via the road network, apartheid policies designed the historical road network to segregate rather than integrate. Since the end of apartheid, there has been an increased need for integrated urban accessibility. Since government initiatives are typically enacted at a regional level, it is relevant to model accessibility between regions. Very few methods in the literature model road-based inter-regional accessibility, and none account for structural characteristics of the road network. The aim of this thesis is to develop a novel stochastic model that estimates road-based inter-regional accessibility and that is able to take the homogeneity of road networks into account. The accessibility model utilises Markov chain theory. Each region represents a state, and the average inverse distances between regions act as transition probabilities. Transition probabilities between adjacent regions are stored in a 1-step transition probability matrix (TPM). Assuming the Markov property holds, raising the TPM to the power n gives transition probabilities between regions up to n steps away. Letting n → ∞ gives the prominence index, which quantifies the accessibility of a region regardless of the journey's starting point. Road network homogeneity is tested by extending a test for the homogeneity of spatial point patterns to spatial linear networks.
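The TPM-powering idea in the accessibility item above can be sketched on a hypothetical three-region example; the distances and the power 50 are arbitrary illustrative choices, not values from the thesis:

```python
# Inverse distances between regions are row-normalised into a 1-step TPM;
# raising it to a high power makes every row converge to the same vector,
# read here as a prominence (accessibility) index for each region.
import numpy as np

dist = np.array([[np.inf, 1.0, 4.0],
                 [1.0, np.inf, 2.0],
                 [4.0, 2.0, np.inf]])             # toy inter-region distances
weights = 1.0 / dist                              # inverse distance; 0 on the diagonal
tpm = weights / weights.sum(axis=1, keepdims=True)

prominence = np.linalg.matrix_power(tpm, 50)[0]   # any row works in the limit
print(np.round(prominence, 3))
```

The region with the largest total inverse-distance weight (the best-connected one) receives the highest prominence, matching the interpretation of accessibility regardless of the journey's starting point.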
An unsupervised clustering method is then developed which subdivides a road network into regions that are as homogeneous as possible. Finally, road-based accessibility is calculated between these regions. The accessibility model was first applied to electoral wards in the City of Tshwane. Based on the wards, the central business district (CBD) was the most accessible, but there was poor accessibility to the CBD from outlying townships. The homogeneity test showed that distinct residential neighbourhoods were internally homogeneous, and it was thus able to identify neighbourhoods within a road network. The unsupervised clustering method was then used to identify two new regionalisations of the road network within the City of Tshwane at different spatial scales, and the accessibility model was applied to these regionalisations. For one regionalisation, an emerging economic area was the most accessible, while for the other, a central educational area was the most accessible. Although accessibility was not correlated with road network homogeneity, different spatial scales and regionalisations had a marked impact on the accessibility results. This thesis develops a new characterisation of spatial linear networks based on their homogeneity, and uses it to investigate the state of inter-regional road-based accessibility in the City of Tshwane. This is a crucial area of research in the move towards a more equitable and sustainable future.

Item: Spatial-temporal topic modelling of COVID-19 tweets in South Africa (University of Pretoria, 2023-12-07)
Mazarura, Jocelyn; Fabris-Rotelli, Inger Nicolette; u18073159@tuks.co.za; Jafta, Papama Hlumela Gandhi

In the era of social media, the analysis of Twitter data has become increasingly important for understanding the dynamics of online discourse. This research introduces a novel approach for tracking the spatial and temporal evolution of topics in Twitter data.
Leveraging the spatial and temporal labels that Twitter provides for tweets, we propose the Clustered Biterm Topic Model. This model combines the Biterm Topic Model with k-medoids clustering to uncover the intricate topic development patterns over space and time. To enhance the accuracy and applicability of our model, we introduce an innovative element: a covariate-dependent matrix. This matrix incorporates essential covariate information and geographic proximity into the dissimilarity matrix used by k-medoids clustering. By considering the inherent semantic relationships between topics and the contextual information provided by covariates and geographic proximity, our model captures the complex interplay of topics as they emerge and evolve across different regions and timeframes on Twitter. The proposed Clustered Biterm Topic Model offers a robust and versatile tool for researchers, policymakers, and businesses to gain deeper insights into the dynamic landscape of online conversations, which are inherently shaped by space and time.

Item: A robust simulation to compare meaningful batting averages in cricket (University of Pretoria, 2023-11-17)
Van Staden, Paul J.; Fabris-Rotelli, Inger Nicolette; u17150818@tuks.co.za; Vorster, Johannes S.

In cricket, the traditional batting average is the most common measure of a player's batting performance. However, the batting average can easily be inflated by a high number of not-out innings. Therefore, in this research eight alternative methods are used and compared to the traditional batting average to estimate the true batting average. It is also known that there is a range of different batters within a cricket team, namely first order, middle order, tail-enders, and a special class of players who can both bat and bowl, known as all-rounders.
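The not-out inflation noted in the cricket item above is easy to see with toy numbers (all hypothetical): the traditional average divides career runs by dismissals only, so undefeated innings add runs without adding to the denominator:

```python
# Five hypothetical innings; False marks a not-out innings.
runs = [50, 30, 10, 40, 20]
dismissed = [True, False, True, False, True]

traditional_avg = sum(runs) / sum(dismissed)   # 150 runs / 3 dismissals
runs_per_innings = sum(runs) / len(runs)       # 150 runs / 5 innings
print(traditional_avg, runs_per_innings)       # prints 50.0 30.0
```

Neither figure is the "true" average the thesis estimates; the gap between them simply shows why alternative estimators are needed.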
There are also different formats of international cricket, namely Test, One-Day International (ODI), and Twenty20 International (T20I) cricket, where Test cricket has unlimited overs compared to the limited overs of ODI and T20I cricket. A method for estimating the batting average should be able to account for this variability. The chosen method should also work for a player's career as well as for a short series or tournament. By using the traditional bootstrap and the smoothed bootstrap, this study compares the variability of each estimation method for a player's career and for a series or tournament, respectively. An R Shiny application introduces the alternative batting performance measures, enabling accessible analysis beyond the conventional average for a comprehensive understanding of player capabilities.

Item: A mixture model approach to extreme value analysis of heavy tailed processes (University of Pretoria, 2023-12-07)
Maribe, Gaonyalelwe; Kanfer, Frans; Millard, Sollie; lizosanqela@gmail.com; Sanqela, Lizo

Extreme value theory (EVT) encompasses statistical tools for modelling extreme events, which are defined in the peaks-over-threshold methodology as excesses over a certain high threshold. The estimation of this threshold is a crucial problem and an ongoing area of research in EVT. This dissertation investigates extreme value mixture models, which bypass threshold selection. In particular, we focus on the Extended Generalised Pareto Distribution (EGPD), a model for the full range of data characterised by the presence of extreme values. We consider the non-parametric EGPD based on a Bernstein polynomial approximation. The ability of the EGPD to estimate the extreme value index (EVI) is investigated for distributions in the Fréchet, Gumbel and Weibull domains through a simulation study. Model performance is measured in terms of bias and mean squared error.
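As a point of reference for the EVI estimation discussed above, the classical Hill estimator for heavy-tailed (Fréchet-domain) data is a standard benchmark; the Pareto sample, seed, and the choice k = 200 below are illustrative only and are not the thesis's EGPD approach:

```python
# Hill estimator of the extreme value index from the k largest observations:
# H(k) = mean(log X_(n-i+1), i = 1..k) - log X_(n-k).
import numpy as np

rng = np.random.default_rng(3)
evi_true = 0.5
x = rng.pareto(1 / evi_true, size=5000) + 1.0  # Pareto tail with EVI = 0.5

def hill(x, k):
    xs = np.sort(x)[::-1]                      # descending order statistics
    return float(np.mean(np.log(xs[:k])) - np.log(xs[k]))

evi_hat = hill(x, k=200)
print(round(evi_hat, 2))                       # close to the true EVI of 0.5
```

The estimator's sensitivity to the choice of k is exactly the threshold-selection problem that extreme value mixture models such as the EGPD are designed to bypass.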
We also carry out a case study on rainfall data to illustrate how the EGPD fits as a distribution for the full range of data. The case study also includes quantile estimation. We further propose substituting the Pareto distribution for the GPD as the tail model of the EGPD in the case of heavy-tailed data. We give the mathematical background of this new model and show that it is a member of the EGPD family and thus complies with EVT. We compare the new model's bias and mean squared error in EVI estimation to those of the original EGPD through a simulation study. Furthermore, the simulation study is extended to include other estimators for Fréchet-type data. Moreover, a case study is carried out on the Belgian Secura Re data.

Item: An interactive R Shiny application for learning multivariate data analysis and time series modelling (University of Pretoria, 2024-02-07)
Salehi, Mahdi; Bekker, Andriette, 1958-; Arashi, Mohammad; francesmotala@gmail.com; Frances, Motala Charles

Multivariate analysis and time series modelling are essential data analysis techniques that provide a comprehensive approach to understanding complex datasets and supporting data-driven decision-making. Multivariate analysis involves the simultaneous examination of multiple variables, enabling the exploration of intricate relationships, dependencies, and patterns within the data. Time series modelling, on the other hand, focuses on data evolving over time, facilitating the detection of trends and seasonal patterns and the forecasting of future values. In addition to the multivariate and time series analysis techniques, we expand our focus to include machine learning, a field dedicated to developing algorithms and models for data-driven predictions and decisions. The primary contribution of this dissertation is the development of an innovative R Shiny application known as the Advanced Modelling Application (AM application).
The AM application revolutionizes multivariate analysis, machine learning, and time series modelling by bridging the gap between complexity and usability. With its intuitive interface and advanced statistical techniques, the application empowers users to explore intricate datasets, discover hidden patterns, and make informed decisions. Interactive visualizations and filtering capabilities enable users to identify correlations, dependencies, and influential factors among multiple variables. Moreover, the integration of machine learning algorithms allows users to leverage predictive analytics, creating robust models that uncover latent insights within the data and make accurate predictions for informed decision-making. Additionally, the application incorporates state-of-the-art algorithms for time series analysis, simplifying the analysis of temporal patterns, the forecasting of future trends, and the optimization of model parameters. This ground-breaking tool is designed to unlock the full potential of data, enabling users to drive impactful outcomes.

Item: Breaking the norm : approaches for symmetric, positive, and skewed data (University of Pretoria, 2023-11-06)
Bekker, Andriette, 1958-; Arashi, Mohammad; matthias@dilectum.co.za; Wagener, Matthias

This research contributes to the advancement of flexible and interpretable models within distribution theory, a fundamental aspect of numerous academic disciplines. The study investigates and presents the derivative-kernel approach for extending distributions. This method yields new distributions for symmetric, skew, and positive data, making it applicable to a wide range of modelling tasks. The newly derived distributions enhance the normal and gamma distributions by incorporating easily interpretable and identifiable parameters while retaining tractable mathematical properties. Furthermore, these models have a solid statistical foundation for simulation and prediction through stochastic representations.
Additionally, these models demonstrate strong flexibility and modelling performance when applied to real data. The introduced skew distribution presents a new skewing mechanism that combines the best features of current leading methods, leading to improved accuracy and flexibility when modelling skewed data patterns. In today's rapidly evolving data landscape, with increasingly intricate data structures, these advancements provide vital tools for effectively interpreting and analysing the diverse data patterns encountered in economics, psychology, engineering, and biology.

Item: Bayesian learning of regularized Gaussian graphical networks (University of Pretoria, 2024)
Arashi, Mohammad; Bekker, Andriette, 1958-; u14016665@tuks.co.za; Smith, Jarod Mark

The advancement of digitisation in various scientific disciplines has generated data with numerous variables. Gaussian graphical models (GGMs) offer a convenient framework for analysing and interpreting the conditional relationships among these variables, with network inference relying on estimating the precision matrix within a multivariate Gaussian framework. Two novel Bayesian shrinkage methods are proposed for estimating the precision matrix. The first develops a Bayesian treatment of the frequentist ridge precision estimator with the common l2 penalty, allowing for networks that are not necessarily highly sparse. The second caters for diverse sparsity by enabling both l1- and l2-based shrinkage within a naïve elastic net setting. Full block Gibbs samplers are provided for implementing the new estimators. The Bayesian graphical ridge and naïve elastic net priors are extended to allow for flexible shrinkage of the off-diagonal elements of the precision matrix. Simulations and practical case studies show that the proposed estimators compare favourably with competing methods and enrich methodological flexibility for data analysis.
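As a simple frequentist point of reference for the ridge-type shrinkage discussed in the graphical networks item above, a regularised precision estimate can be formed by inverting S + λI; the dimensions and λ are arbitrary, and this is a generic sketch, not the Bayesian estimator developed in the thesis:

```python
# Ridge-type regularisation keeps the covariance matrix invertible even when
# the number of variables p exceeds the sample size n, so a (shrunken)
# precision-matrix estimate always exists.
import numpy as np

rng = np.random.default_rng(4)
n, p = 20, 50                                 # fewer observations than variables
data = rng.normal(size=(n, p))
S = np.cov(data, rowvar=False)                # singular: rank at most n - 1

lam = 0.5
precision_hat = np.linalg.inv(S + lam * np.eye(p))

print(precision_hat.shape)                    # prints (50, 50)
```

The Bayesian treatments in the thesis replace this fixed λ with priors on the precision matrix, sampled via block Gibbs samplers.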
To this end, a Bayesian approach for estimating differential networks (DNs), using the Bayesian adaptive graphical lasso, is introduced. Comparisons with state-of-the-art frequentist techniques highlight the utility of the proposed technique. The novel samplers are available in the 'baygel' R package to facilitate usage and exploration by practitioners.

Item: The theory and application of bootstrap control charts for statistical process control (University of Pretoria, 2017-05)
Graham, Marien Alet; Coetzee, Evert Johan

Chapter 1 of this mini-dissertation gives an introduction to statistical process control (SPC) and provides some background on the Shewhart, CUSUM and EWMA control charts. The bootstrap of [9] Efron (1979) is discussed and a brief overview of Phase I and Phase II analysis is given. The chapter concludes with the research objectives of the dissertation. Chapter 2 provides a literature review of bootstrap Shewhart, cumulative sum (CUSUM), exponentially weighted moving average (EWMA) and multivariate control charts. The Shewhart-type control charts mostly focus on the bootstrap procedures proposed by [2] Bajgier (1992), [34] Seppala, Moskowitz, Plante and Tang (1995) and [23] Liu and Tang (1996). An overview of the bootstrap CUSUM charts proposed by [7] Chatterjee and Qiu (2009) and [1] Ambartsoumian and Jeske (2015) is given. A review of the parametric bootstrap control chart used by [33] Saleh, Mahmoud, Jones-Farmer, Zwetsloot and Woodall (2015) to construct EWMA control charts is given. The chapter concludes with a review of the bootstrap T2 control chart proposed by [32] Phaladiganon, Kim, Chen, Baek and Park (2011). In Chapter 3 the design of a potential nonparametric bootstrap EWMA control chart is given. The chapter concludes with two examples of how the control limits for such a chart can be constructed for two different statistics.
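The bootstrap charting idea running through the chapters above can be sketched generically: resample a Phase I reference sample, compute the charting statistic for each replicate, and read control limits off the percentiles of the bootstrap distribution. The data, subgroup size, and percentile choices below are illustrative, not the dissertation's EWMA design:

```python
# Percentile-bootstrap control limits for the subgroup mean of a Phase I
# in-control sample; the 0.135th/99.865th percentiles mimic 3-sigma limits.
import numpy as np

rng = np.random.default_rng(5)
phase1 = rng.normal(loc=10.0, scale=2.0, size=100)   # in-control reference data

B, n = 2000, 5                                       # replicates, subgroup size
boot_means = np.array([rng.choice(phase1, size=n).mean() for _ in range(B)])
lcl, ucl = np.percentile(boot_means, [0.135, 99.865])
print(round(float(lcl), 2), round(float(ucl), 2))
```

Because the limits come from resampling rather than a normality assumption, the same recipe applies to other statistics, which is the appeal of bootstrap charts for non-normal processes.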
Chapter 4 examines the conditional in-control (IC) and out-of-control (OOC) average run-lengths of the chart proposed in Chapter 3, taking different underlying process distributions into consideration. Chapter 5 concludes the mini-dissertation by summarising the research and providing recommendations for further research.

Item: Construction and parameter estimation of wrapped normal models (University of Pretoria, 2019-08)
Loots, Mattheus Theodor; Bekker, Andriette, 1958-; hannalineroux@gmail.com; Roux, Hannaline

If a known distribution on the real line is given, it can be wrapped onto the circumference of a unit circle. This research entails the study of a univariate skew-normal distribution, where the skew-normal distribution is generalised for the case of bimodality. Both the skew-normal and flexible generalised skew-normal distributions are wrapped onto a unit circle, consequently referred to as a wrapped skew-normal and a wrapped flexible generalised skew-normal distribution, respectively. For each of these distributions a simulation study is conducted in which the performance of maximum likelihood estimation is evaluated. Skew scale mixtures of normal distributions, together with the wrapped versions of these distributions, are proposed and graphical representations are provided. These distributions are also compared in an application to wind direction data.

Item: Lower quantile estimation within an artificially censored framework (University of Pretoria, 2020)
Bekker, Andriette, 1958-; Ferreira, Johan T.; jarodsmith706@gmail.com; Smith, Jarod

Quantile estimation is a vital aspect of statistical analyses in a variety of fields. For example, lower quantile estimation is crucial to ensure the safety and reliability of wood-built structures. Various statistical techniques, including parametric, non-parametric and mixture modelling, are available for the estimation of lower quantiles.
An intuitive approach is to consider models that fit the tail of the sample instead of the entire range. Quantiles of interest can then be estimated by artificially censoring observations beyond a chosen threshold. The choice of threshold is crucial to ensure efficient and unbiased quantile estimates; usually the 10th empirical percentile is chosen as the threshold. [16] proposes a bootstrap approach to obtain a better threshold for the censored Weibull MLE; however, this approach is computationally expensive. A new threshold selection technique is proposed that makes use of a standardised-weighted adjusted truncated Kolmogorov-Smirnov test (SWAKS-MLE). The SWAKS-MLE outperforms the bootstrap-threshold censored Weibull MLE method, in addition to being vastly less computationally intensive.

Item: A functional approach to distribution modelling : the spliced generalised normal distribution (University of Pretoria, 2019)
Bekker, Andriette, 1958-; Arashi, Mohammad; Naderi, M.; matthias@dilectum.co.za; Wagener, Matthias

A new body-and-tail generalisation of the normal distribution is introduced, the spliced generalised normal (SGN). A special case of the SGN, the tail-adjusted normal distribution, is further generalised with two-piece scaling to accommodate different combinations of skewness and tail weight in data. The two-piece scaled tail-adjusted normal (TPTAN) is thoroughly studied, with derivations of various statistical properties such as the probability density function, cumulative distribution function, quantile function, moments, and Fisher information. The applicability of the SGN distribution is demonstrated by applying the TPTAN to light- and heavy-tailed data sets. The small- and large-sample performance of the TPTAN is investigated with an extensive simulation study. The methods of estimation include maximum likelihood and Kolmogorov-Smirnov estimation.
Goodness of fit is evaluated by likelihood criteria and hypothesis tests, including the Akaike information criterion, Bayesian information criterion, consistent Akaike information criterion, Hannan-Quinn information criterion, and the Kolmogorov-Smirnov and Bayes factor tests.

Item: Distributional properties of ratios of gamma random variables in the context of quality control (University of Pretoria, 2016)
Human, Schalk William; Bekker, Andriette, 1958-; Mijburgh, Philip Albert

This study emanates from a practical problem in the statistical process control (SPC) environment, where the quality of a process is monitored; specifically, where the variance of a process is assessed to be the same for all samples. In the traditional SPC environment the parameters of the underlying manufacturing process are usually assumed to be known. If, however, they are not known, they need to be estimated. Estimating these parameters and using them in control charts has many associated problems, especially when the samples used to calculate the estimates contain few data points. This study proposes a new control chart that is used to detect a shift in the process variance but does not directly rely on parameter estimates, and as such overcomes many of these problems. The development of this newly proposed control chart gives rise to a new beta-type distribution. An overview of the problem statement as identified in the field of SPC is given and the newly developed beta-type distribution is proposed. Some statistical properties of this distribution are studied and the effect of different parameter choices on the shape of the distribution is investigated, with the focus specifically on the bivariate case.
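The beta-type connection behind ratios of gamma random variables rests on the classical fact that if X ~ Gamma(a) and Y ~ Gamma(b) are independent with a common scale, then X/(X + Y) ~ Beta(a, b); this is background context, not the new distribution derived in the study. A quick Monte Carlo check with arbitrary shape parameters:

```python
# Monte Carlo check: the ratio X/(X + Y) of independent gammas has the
# Beta(a, b) mean a/(a + b).
import numpy as np

rng = np.random.default_rng(6)
a, b = 3.0, 5.0
X = rng.gamma(a, size=100_000)
Y = rng.gamma(b, size=100_000)
ratio = X / (X + Y)

print(round(float(ratio.mean()), 3))   # theoretical mean a/(a + b) = 0.375
```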
Through simulation, a comparison study is also performed, comparing the newly proposed model with a generalised version of the Q-chart model, which was studied in depth by Adamski (2014).

Item: Developments in Wishart ensemble and Bayesian application (University of Pretoria, 2017)
Bekker, Andriette, 1958-; Arashi, Mohammad; Van Niekerk, Janet

The increased complexity and dimensionality of data necessitate the development of new models that can adequately model the data. Advances in computational approaches have paved the way for the consideration and implementation of more complicated models, previously avoided due to practical difficulties. New models within the Wishart ensemble are developed and some of their properties are derived. Algorithms for the practical implementation of these matrix variate models are proposed. Simulation studies and real datasets are used to illustrate the use and improved performance of these new models in Bayesian analysis of the multivariate and univariate normal models. From this research study the following papers emanated:

1. J. Van Niekerk, A. Bekker, M. Arashi, and J.J.J. Roux (2015). "Subjective Bayesian analysis of the elliptical model". In: Communications in Statistics - Theory and Methods 44.17, 3738–3753
2. J. Van Niekerk, A. Bekker, M. Arashi, and D.J. De Waal (2016). "Estimation under the matrix variate elliptical model". In: South African Statistical Journal 50.1, 149–171
3. J. Van Niekerk, A. Bekker, and M. Arashi (2016). "A gamma-mixture class of distributions with Bayesian application". In: Communications in Statistics - Simulation and Computation (Accepted)
4. M. Arashi, A. Bekker, and J. Van Niekerk (2017). "Weighted-type Wishart distributions with application". In: Revstat 15(2), 205–222
5. A. Bekker, J. Van Niekerk, and M. Arashi (2017). "Wishart distributions - Advances in Theory with Bayesian application".
In: Journal of Multivariate Analysis 155, 272–283

Item: Product of independent generalised gamma random variables (University of Pretoria, 2016-10)
Bekker, Andriette, 1958-; Marques, Filipe; Bilankulu, Vusi Raphael

The generalised gamma distribution has received much attention due to its flexibility and for having some well-known distributions as special cases. This study originates from a statistic defined as the ratio of products of independent generalised gamma random variables, and shows that it can be represented as the product of independent generalised gamma random variables under some re-parametrisation. By decomposing the characteristic function of the negative logarithm of the statistic and then using the distribution of the difference of two independent generalised integer gamma random variables as a basis, accurate and computationally appealing near-exact distributions are derived for the statistic. In the process, a new flexible parameter is introduced in the near-exact distributions, which allows control over the degree of precision of these approximations. Furthermore, the performance of the near-exact distributions is assessed using a measure of proximity between cumulative distribution functions, as well as by comparison with the exact distribution, the empirical distribution, and an approximation developed using a different method that can only be applied in some particular cases.

Item: Contributions to κ-μ models in wireless systems (University of Pretoria, 2017-06)
Bekker, Andriette, 1958-; Arashi, Mohammad; Nagar, Priyanka

Please read abstract in the document