- Machine learning
- Data analysis / statistics

Professor

Academic advisor, BA Künstliche Intelligenz (Artificial Intelligence)

by appointment

Talk

- Robert Hable

KI in der Produktion – Methoden, Praxisbeispiele, Strategien [AI in Production: Methods, Practical Examples, Strategies]

In: Das Webinar am Freitag (Bayern Innovativ - Bayerische Gesellschaft für Innovation und Wissenstransfer mbH)

Online

- 09.12.2022 (2022)

- Angewandte Informatik
- DIGITAL

Talk

- Robert Hable

Fragebögen erstellen und auswerten - (k)eine Wissenschaft für sich [Designing and Evaluating Questionnaires: (Not) a Science in Itself]

In: Digitale Methoden in der Forschung

Netzwerk Internet und Digitalisierung Ostbayern Online

- 02.05.2022 (2022)

- Angewandte Informatik
- DIGITAL

Talk

- Robert Hable

KI in der Produktion – Methoden, Praxisbeispiele, Strategien [AI in Production: Methods, Practical Examples, Strategies]

In: Das Webinar am Freitag

Cluster Mechatronik & Automation Online

- 01.04.2022 (2022)

- Angewandte Informatik
- DIGITAL

Journal article

- Michael Heigl
- Kumar Anand
- Andreas Urmann
- D. Fiala
- Martin Schramm
- Robert Hable

On the Improvement of the Isolation Forest Algorithm for Outlier Detection with Streaming Data

In: Electronics vol. 10 pg. 1534.

- (2021)

DOI: 10.3390/electronics10131534

In recent years, detecting anomalies in real-world computer networks has become a more and more challenging task due to the steady increase of high-volume, high-speed and high-dimensional streaming data, for which ground truth information is not available. Efficient detection schemes applied on networked embedded devices need to be fast and memory-constrained, and must be capable of dealing with concept drifts when they occur. Different approaches for unsupervised online outlier detection have been designed to deal with these circumstances in order to reliably detect malicious activity. In this paper, we introduce a novel framework called PCB-iForest, which generalized, is able to incorporate any ensemble-based online OD method to function on streaming data. Carefully engineered requirements are compared to the most popular state-of-the-art online methods with an in-depth focus on variants based on the widely accepted isolation forest algorithm, thereby highlighting the lack of a flexible and efficient solution which is satisfied by PCB-iForest. Therefore, we integrate two variants into PCB-iForest—an isolation forest improvement called extended isolation forest and a classic isolation forest variant equipped with the functionality to score features according to their contributions to a sample’s anomalousness. Extensive experiments were performed on 23 different multi-disciplinary and security-related real-world datasets in order to comprehensively evaluate the performance of our implementation compared with off-the-shelf methods. The discussion of results, including AUC, F1 score and averaged execution time metric, shows that PCB-iForest clearly outperformed the state-of-the-art competitors in 61% of cases and even achieved more promising results in terms of the tradeoff between classification and computational costs.

- Elektrotechnik und Medientechnik
- Institut ProtectIT
- DIGITAL

Talk

- Robert Hable

KI in der Produktion – Methoden, Praxisbeispiele, Strategien [AI in Production: Methods, Practical Examples, Strategies]

In: Das Webinar am Freitag

Cluster Mechatronik & Automation Online

- 11.06.2021 (2021)

- Angewandte Informatik
- DIGITAL

Talk

- Robert Hable

Best Practice Projekt: Internationales Big Data Zentrum Ostbayern-Südböhmen [Best-Practice Project: International Big Data Center East Bavaria-South Bohemia]

In: INTERREG VI-A: Bayern-Österreich & Bayern-Tschechien 2021 – 2027: Möglichkeiten für Hochschulen in der Europaregion Donau-Moldau

Europaregion Donau-Moldau e.V. Online

- 08.10.2021 (2021)

- Angewandte Informatik
- DIGITAL

Talk

- Robert Hable

Künstliche Intelligenz – was bringt es dem Mittelstand? [Artificial Intelligence: What Does It Offer Small and Medium-Sized Enterprises?]

In: Online-Vortragsreihe Bayern - Tschechien

EUREGIO Bayerischer Wald - Böhmerwald - Unterer Inn Online

- 30.03.2021 (2021)

- Angewandte Informatik
- DIGITAL

Talk

- Robert Hable

Fitting Additive Models With Regularized Kernel Methods: Methodology, Robustness Properties, and Business Applications

In: DAGStat Conference 2019

Ludwig-Maximilians-Universität München

- 20.03.2019 (2019)

- Angewandte Informatik
- TC Grafenau
- DIGITAL

Talk

- Robert Hable

Maschinelles Lernen erfolgreich nutzen [Using Machine Learning Successfully]

In: DigiCamp zum Thema "4.0 braucht Künstliche Intelligenz (KI) - Praktische KI erleben"

Technische Hochschule Deggendorf Deggendorf

- 16.01.2019 (2019)

- TC Grafenau
- Angewandte Informatik
- DIGITAL

Talk

- Robert Hable

Big Data in der Produktion [Big Data in Production]

Cluster Mechatronik Akademie Regensburg

- 27.11.2018 (2018)

- TC Grafenau
- Angewandte Informatik
- DIGITAL

Talk

- Robert Hable

Statistische Daten in der Praxis: Zeitverschwendung oder Goldgrube? [Statistical Data in Practice: Waste of Time or Gold Mine?]

Bayerisches Landwirtschaftsministerium München

- 20.02.2018 (2018)

- Angewandte Informatik
- TC Grafenau
- DIGITAL

Talk

- Robert Hable

BIG DATA in der Großküche - Was steckt dahinter, wo liegen die Chancen und Risiken? [Big Data in the Commercial Kitchen: What Is Behind It, and Where Are the Opportunities and Risks?]

In: Jahresmitgliederversammlung des HKI Industrieverbands

Berlin

- 16.04.2018 (2018)

- Angewandte Informatik
- TC Grafenau
- DIGITAL

Talk

- Robert Hable

What Machine Learning Can Do

In: TechDays

München

- 14.05.2018 (2018)

- Angewandte Informatik
- TC Grafenau
- DIGITAL

Journal article

- Nari Arunraj
- Robert Hable
- Michael Fernandes
- Karl Leidl
- Michael Heigl

Comparison of Supervised, Semi-supervised and Unsupervised Learning Methods in Network Intrusion Detection Systems (NIDS) Application

In: Anwendungen und Konzepte in der Wirtschaftsinformatik (AKWI) pg. 10-19.

- (2017)

With the emergence of the fourth industrial revolution (Industrie 4.0) of cyber physical systems, intrusion detection systems are highly necessary to detect industrial network attacks. Recently, the increase in application of specialized machine learning techniques is gaining critical attention in the intrusion detection community. A wide variety of learning techniques proposed for different network intrusion detection system (NIDS) problems can be roughly classified into three broad categories: supervised, semi-supervised and unsupervised. In this paper, a comparative study of selected learning methods from each of these three kinds is carried out. In order to assess these learning methods, they are subjected to investigate network traffic datasets from an Airplane Cabin Demonstrator. In addition to this, the imbalanced classes (normal and anomaly classes) that are present in the captured network traffic data is one of the most crucial issues to be taken into consideration. From this investigation, it has been identified that supervised learning methods (logistic and lasso logistic regression methods) perform better than other methods when historical data on former attacks are available. The results of this study have also showed that the performance of semi-supervised learning method (One class support vector machine) is comparatively better than unsupervised learning method (Isolation Forest) when historical data on former attacks are not available.

- TC Teisnach Sensorik
- TC Grafenau
- Institut ProtectIT
- DIGITAL

Talk

- Robert Hable

Prognosen in Unternehmen: Praxisbeispiele und Handlungsempfehlungen [Forecasting in Companies: Practical Examples and Recommendations for Action]

In: Prognosekonferenz

Grafenau

- 06.11.2017 (2017)

- Angewandte Informatik
- TC Grafenau
- DIGITAL

Talk

- Robert Hable

Big Data - erste Schritte wagen! Praxisbeispiele und Handlungsempfehlungen für Unternehmen [Big Data: Daring the First Steps! Practical Examples and Recommendations for Companies]

In: Sensorik Symposium

Strategische Partnerschaft Sensorik e.V. Regensburg

- 27.09.2017 (2017)

- Angewandte Informatik
- TC Grafenau
- DIGITAL

Journal article

- C. Dupke
- C. Bonenfant
- B. Reineking
- Robert Hable
- T. Zeppenfeld
- M. Ewald
- M. Heurich

Habitat selection by a large herbivore at multiple spatial and temporal scales is primarily governed by food resources

In: Ecography - Pattern and Process in Ecology vol. 40 pg. 1014-1027.

- (2017)

DOI: 10.1111/ecog.02152

Habitat selection can be considered as a hierarchical process in which animals satisfy their habitat requirements at different ecological scales. Theory predicts that spatial and temporal scales should co‐vary in most ecological processes and that the most limiting factors should drive habitat selection at coarse ecological scales, but be less influential at finer scales. Using detailed location data on roe deer Capreolus capreolus inhabiting the Bavarian Forest National Park, Germany, we investigated habitat selection at several spatial and temporal scales. We tested 1) whether time‐varying patterns were governed by factors reported as having the largest effects on fitness, 2) whether the trade‐off between forage and predation risks differed among spatial and temporal scales and 3) if spatial and temporal scales are positively associated. We analysed the variation in habitat selection within the landscape and within home ranges at monthly intervals, with respect to land‐cover type and proxys of food and cover over seasonal and diurnal temporal scales. The fine‐scale temporal variation follows a nycthemeral cycle linked to diurnal variation in human disturbance. The large‐scale variation matches seasonal plant phenology, suggesting food resources being a greater limiting factor than lynx predation risk. The trade‐off between selection for food and cover was similar on seasonal and diurnal scale. Habitat selection at the different scales may be the consequence of the temporal variation and predictability of the limiting factors as much as its association with fitness. The landscape of fear might have less importance at the studied scale of habitat selection than generally accepted because of the predator hunting strategy. Finally, seasonal variation in habitat selection was similar at the large and small spatial scales, which may arise because of the marked philopatry of roe deer. The difference is supposed to be greater for wider ranging herbivores.

- TC Grafenau

Talk

- Robert Hable

Big Data Analytics im Unternehmen: Strategien, Praxisbeispiele und Methoden [Big Data Analytics in the Enterprise: Strategies, Practical Examples, and Methods]

In: Advanced Analytics Infrastructure Dialog

Frankfurt am Main

- 04.12.2017 (2017)

- TC Grafenau
- Angewandte Informatik
- DIGITAL

Journal article

- H. Albrecht
- J. Gallitz
- Robert Hable
- M. Vieth
- G. Tontini
- M. Neurath
- J. Riemann
- H. Neumann

The Offer of Advanced Imaging Techniques Leads to Higher Acceptance Rates for Screening Colonoscopy - a Prospective Study

In: Asian Pacific Journal of Cancer Prevention vol. 17 pg. 3871-3875.

- (2016)

Colonoscopy plays a fundamental role in early diagnosis and management of colorectal cancer and requires public and professional acceptance to ensure the ongoing success of screening programs. The aim of the study was to prospectively assess whether patient acceptance rates to undergo screening colonoscopy could be improved by the offer of advanced imaging techniques. Materials and Methods: Overall, 372 randomly selected patients were prospectively included. A standardized questionnaire was developed that inquired of the patients their knowledge regarding advanced imaging techniques. Second, several media campaigns and information events were organized reporting about advanced imaging techniques, followed by repeated evaluation. After one year the evaluation ended. Results: At baseline, 64% of the patients declared that they had no knowledge about new endoscopic methods. After twelve months the overall grade of information increased significantly from 14% at baseline to 34%. The percentage of patients who decided to undergo colonoscopy because of the offer of new imaging methods also increased significantly from 12% at baseline to 42% after 12 months. Conclusions: Patients were highly interested in the offer of advanced imaging techniques. Knowledge about these techniques could relatively easy be provided using local media campaigns. The offer of advanced imaging techniques leads to higher acceptance rates for screening colonoscopies.

- TC Grafenau
- Angewandte Informatik
- GESUND

Journal article

- K. Strohriegel
- Robert Hable

Qualitative robustness of estimators on stochastic processes

In: Metrika vol. 79 pg. 895-917.

- (2016)

DOI: 10.1007/s00184-016-0582-z

A lot of statistical methods originally designed for independent and identically distributed (i.i.d.) data are also successfully used for dependent observations. Still most theoretical investigations on robustness assume i.i.d. pairs of random variables. We examine an important property of statistical estimators—the qualitative robustness in the case of observations which do not fulfill the i.i.d. assumption. In the i.i.d. case qualitative robustness of a sequence of estimators is, according to Hampel (Ann Math Stat 42:1887–1896, 1971), ensured by continuity of the corresponding statistical functional. A similar result for the non-i.i.d. case is shown in this article. Continuity of the corresponding statistical functional still ensures qualitative robustness of the estimator as long as the data generating process satisfies a certain convergence condition on its empirical measure. Examples for processes providing such a convergence condition, including certain Markov chains or mixing processes, are given as well as examples for qualitatively robust estimators in the non-i.i.d. case.

- Angewandte Informatik
- TC Grafenau
- DIGITAL

Journal article

- H. Albrecht
- J. Gallitz
- Robert Hable
- M. Vieth
- G. Tontini
- M. Neurath
- J. Riemann
- H. Neumann

The Offer of Advanced Imaging Techniques Leads to Higher Acceptance Rates for Screening Colonoscopy - A Prospective Multivariate Analysis of Data From a Patient Questionnaire

In: Gastrointestinal Endoscopy vol. 83 pg. AB359-AB360.

- (2016)

DOI: 10.1016/j.gie.2016.03.916

- Angewandte Informatik
- TC Grafenau
- GESUND
- DIGITAL

Journal article

- O. Fishkis
- K. Müller
- Robert Hable
- B. Huwe

Effects of Throughfall Exclusion, Soil Texture and Spatial Continuity on Soil Water Repellency in Fichtel Mountains

In: Soil Science Society of America Journal vol. 80 pg. 554-562.

- (2016)

DOI: 10.2136/sssaj2015.10.0386

The occurrence of soil water repellency (SWR) in soil is controlled by soil organic matter (SOM) composition and is strongly soil-moisture dependent. During drying the reduction of water content in soil has been shown to induce the outward orientation of nonpolar ends of organic compounds and hence the increase in SWR. A prolonged drought can however also induce changes in SOM composition which in turn can affect SWR. In this study, we eliminate differences in water content after prolonged throughfall exclusion and a control treatment by oven-drying of the soil samples, to test if a prolonged drought affects SWR even after excluding the direct effect of soil moisture. In addition, the relevance of soil texture variability and spatial dependence of SWR for prediction of soil wettability distribution over the study area was explored. The samples of the upper mineral soil horizon were taken from six plots in Fichtel Mountains, subjected to a throughfall exclusion or control treatments, oven-dried and analyzed for soil texture and water drop penetration time (WDPT). A linear model with spatially correlated random effects was used to quantify the effects of soil texture and treatment on the persistence of the SWR and to simultaneously evaluate the spatial structure of the SWR. Based on estimated parameters the persistence of SWR was calculated on unsampled locations by robust kriging with external drift. The throughfall exclusion treatment significantly increased the log(WDPT) (p < 0.01) of the oven-dried soil by 0.46. The clay content and the sand content had highly significant (p < 0.001) negative effects, while silt content had positive effects on the log(WDPT). The variogram parameter with a range of 5.2 m, a nugget of 0.25, and a sill of 0.45 indicated a rather low degree of spatial dependence of log(WDPT). The main outcome of this study is that the positive effect of throughfall exclusion on SWR cannot be fully attributed to water content reduction. 
Most probably the drought-induced changes in SOM composition and microbial community were responsible for the observed increase in SWR.

- TC Grafenau
- Angewandte Informatik
- NACHHALTIG

Talk

- Robert Hable

Statistical Properties of Support Vector Machines and Related Methods from Machine Learning: Theory and Applications

In: 1. Bayerisch-Tschechische Wissenschaftskonferenz "Datenanalyse"

Jindřichův Hradec, Czech Republic

- 02.-03.07.2015 (2015)

- TC Grafenau
- Angewandte Informatik

Journal article

- O. Fishkis
- M. Wachten
- Robert Hable

Assessment of soil water repellency as a function of soil moisture with mixed modelling

In: European Journal of Soil Science vol. 66 pg. 910-920.

DOI: 10.1111/ejss.12283

An understanding of the relation between soil water repellency (SWR) and soil moisture is a prerequisite of water-flow modelling in water-repellent soil. Here, the relation between SWR and soil moisture was investigated with intact cores of soil taken from three types of soil with different particle-size distributions. The SWR was measured by a sessile drop contact angle (CA) during drying at soil pF values that ranged from −∞ to 4.2. From the measured CA, the work of adhesion (Wa) was calculated and its relation with the pF-value was explored. Mixed modelling was applied to evaluate the effects of pF, soil type and soil depth on CA and Wa. For all soil types, a positive relation was observed between CA and the pF-value that could be represented by a linear model for the pF-range of 1–4.2. The variation in slope and intercept of the CA–pF relationship caused by heterogeneity of the samples taken from a single soil horizon was quantified. In addition, the relation between CA and water content (WC) showed hysteresis, with significantly larger CAs during drying than during wetting.

- TC Grafenau

Journal article

- A.-L. Boulesteix
- Robert Hable
- S. Lauer
- M.J.A. Eugster

A Statistical Framework for Hypothesis Testing in Real Data Comparison Studies

In: The American Statistician vol. 69 pg. 201-212.

DOI: 10.1080/00031305.2015.1005128

In computational sciences, including computational statistics, machine learning, and bioinformatics, it is often claimed in articles presenting new supervised learning methods that the new method performs better than existing methods on real data, for instance in terms of error rate. However, these claims are often not based on proper statistical tests and, even if such tests are performed, the tested hypothesis is not clearly defined and poor attention is devoted to the Type I and Type II errors. In the present article, we aim to fill this gap by providing a proper statistical framework for hypothesis tests that compare the performances of supervised learning methods based on several real datasets with unknown underlying distributions. After giving a statistical interpretation of ad hoc tests commonly performed by computational researchers, we devote special attention to power issues and outline a simple method of determining the number of datasets to be included in a comparison study to reach an adequate power. These methods are illustrated through three comparison studies from the literature and an exemplary benchmarking study using gene expression microarray data. All our results can be reproduced using R codes and datasets available from the companion website http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/020_professuren/boulesteix/compstud2013.

- TC Grafenau
- DIGITAL

Talk

- Robert Hable

Nichtparametrische Klassifikation und Regression mit maschinellen Lernverfahren: Theorie und Anwendungen [Nonparametric Classification and Regression with Machine Learning Methods: Theory and Applications]

Fraunhofer-Institut für Techno- und Wirtschaftsmathematik Kaiserslautern

- 03.08.2015 (2015)

- Angewandte Informatik
- TC Grafenau

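Talk

- Robert Hable

Maschinelles Lernen: Datenanalyse mit Methoden der künstlichen Intelligenz. Gastvortrag in Vorlesung zu "Big Data-Algorithms and Systems" [Machine Learning: Data Analysis with Methods of Artificial Intelligence. Guest lecture in the course "Big Data-Algorithms and Systems"]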

Hochschule Landshut Landshut

- 07.07.2015 (2015)

- Angewandte Informatik
- TC Grafenau

Monograph

- Robert Hable

Einführung in die Stochastik. Ein Begleitbuch zur Vorlesung [Introduction to Stochastics: A Companion Book to the Lecture]

In: SpringerSpektrum Lehrbuch

Springer Berlin [u.a.]

- (2015)

- TC Grafenau

Journal article

- B. Mislimshoeva
- Robert Hable
- C. Samimi
- A. Abdulnazarov
- M. Fezakov
- T. Koellner

Factors Influencing Households' Firewood Consumption in the Western Pamirs, Tajikistan

In: Mountain Research and Development vol. 34 pg. 147-156.

DOI: 10.1659/MRD-JOURNAL-D-13-00113.1

Firewood is a major energy source, especially in many high mountainous regions in developing countries where other energy sources are limited. In the mountainous regions of Tajikistan, current energy consumption is limited owing to geographic isolation and numerous challenges—including in the energy sector—that emerged after the collapse of the Soviet Union and Tajikistan's independence. The sudden disruption of external supplies of energy forced people to rely on locally available but scarce biomass resources, such as firewood and animal dung. We conducted an empirical study to gain an understanding of current household energy consumption in the Western Pamirs of Tajikistan and the factors that influence firewood consumption. For this purpose, we interviewed members of 170 households in 8 villages. We found that, on average, households consumed 355 kg of firewood, 253 kWh of electricity, 760 kg of dung, and 6 kg of coal per month in the winter of 2011–2012. Elevation, size of a household's private garden, and total hours of heating had a positive relationship with firewood consumption, and education level and access to a reliable supply of electricity showed a negative relationship.

- TC Grafenau

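Talk

- Robert Hable

Nichtparametrische Klassifikation und Regression mit SVMs und anderen regularisierten Kern-Verfahren: Statistische Modelle und Inferenz [Nonparametric Classification and Regression with SVMs and Other Regularized Kernel Methods: Statistical Models and Inference]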

In: Kolloquium des Instituts für Medizinische Biometrie und Statistik der Universität Lübeck

Lübeck

- 14.11.2014 (2014)

- Angewandte Informatik
- TC Grafenau

Journal article

- Robert Hable
- A. Christmann

Estimation of Scale Functions to Model Heteroscedasticity by Kernel Based Quantile Methods

In: Journal of Nonparametric Statistics vol. 26 pg. 219-239.

DOI: 10.1080/10485252.2013.875547

A main goal of regression is to derive statistical conclusions on the conditional distribution of the output variable Y given the input values x. Two of the most important characteristics of a single distribution are location and scale. Regularised kernel methods (RKMs) – also called support vector machines in a wide sense – are well established to estimate location functions like the conditional median or the conditional mean. We investigate the estimation of scale functions by RKMs when the conditional median is unknown, too. Estimation of scale functions is important, e.g. to estimate the volatility in finance. We consider the median absolute deviation (MAD) and the interquantile range as measures of scale. Our main result shows the consistency of MAD-type RKMs.
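
The two scale measures named in the abstract can be illustrated in their plain sample form. The following is a minimal sketch, not the kernel-based conditional estimator studied in the article: it computes the unconditional MAD and, as one special case of an interquantile range, the interquartile range. The function names and the choice of quartiles are assumptions made for illustration.

```python
import statistics


def median_absolute_deviation(values):
    """Sample MAD: median of the absolute deviations from the sample median."""
    center = statistics.median(values)
    return statistics.median(abs(v - center) for v in values)


def interquartile_range(values):
    """Interquantile range for the quantile levels 0.25 and 0.75 (an assumed choice)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


clean = [1, 2, 3, 4, 5]
contaminated = [1, 2, 3, 4, 100]  # one gross outlier

# Both measures are unaffected by the single outlier in this small example.
print(median_absolute_deviation(clean), median_absolute_deviation(contaminated))
print(interquartile_range(clean), interquartile_range(contaminated))
```

Both quantities depend only on order statistics, which is why a single extreme observation leaves them unchanged here, whereas the sample standard deviation would change substantially.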

- TC Grafenau

Journal article

- D. Skulj
- Robert Hable

Coefficients of ergodicity for Markov chains with uncertain parameters

In: Metrika vol. 76 pg. 107-133.

DOI: 10.1007/s00184-011-0378-0

One of the central considerations in the theory of Markov chains is their convergence to an equilibrium. Coefficients of ergodicity provide an efficient method for such an analysis. Besides giving sufficient and sometimes necessary conditions for convergence, they additionally measure its rate. In this paper we explore coefficients of ergodicity for the case of imprecise Markov chains. The latter provide a convenient way of modelling dynamical systems where parameters are not determined precisely. In such cases a tool for measuring the rate of convergence is even more important than in the case of precisely determined Markov chains, since most of the existing methods of estimating the limit distributions are iterative. We define a new coefficient of ergodicity that provides necessary and sufficient conditions for convergence of the most commonly used class of imprecise Markov chains. This so-called weak coefficient of ergodicity is defined through an endowment of the structure of a metric space to the class of imprecise probabilities. Therefore we first make a detailed analysis of the metric properties of imprecise probabilities.
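
For orientation, the classical coefficient of ergodicity for a precisely specified finite Markov chain can be computed in a few lines. This sketch shows the standard Dobrushin coefficient, not the weak coefficient for imprecise chains that the article introduces; the function name and the example matrices are assumptions.

```python
def dobrushin_coefficient(P):
    """Classical coefficient of ergodicity of a row-stochastic matrix P.

    tau(P) = 1/2 * max over row pairs (i, j) of sum_k |P[i][k] - P[j][k]|,
    i.e. the largest total variation distance between two rows.
    """
    n = len(P)
    return max(
        0.5 * sum(abs(a - b) for a, b in zip(P[i], P[j]))
        for i in range(n)
        for j in range(n)
    )


# Hypothetical two-state chains used only for illustration.
mixing = [[0.9, 0.1], [0.2, 0.8]]
identity = [[1.0, 0.0], [0.0, 1.0]]
memoryless = [[0.3, 0.7], [0.3, 0.7]]

print(dobrushin_coefficient(mixing))      # strictly between 0 and 1
print(dobrushin_coefficient(identity))    # no contraction
print(dobrushin_coefficient(memoryless))  # equilibrium reached in one step
```

One step of the chain shrinks the total variation distance between any two distributions by at least the factor tau(P), so a value below 1 gives both convergence to a unique equilibrium and a geometric rate.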

- TC Grafenau

Contribution to edited volume/conference proceedings

- A. Christmann
- Robert Hable

On the Bootstrap Approach for Support Vector Machines and Related Kernel Based Methods

- (2013)

- TC Grafenau

Journal article

- Robert Hable

Universal Consistency of Localized Versions of Regularized Kernel Methods

In: Journal of Machine Learning Research vol. 14 pg. 111-144.

In supervised learning problems, global and local learning algorithms are used. In contrast to global learning algorithms, the prediction of a local learning algorithm in a testing point is only based on training data which are close to the testing point. Every global algorithm such as support vector machines (SVM) can be localized in the following way: in every testing point, the (global) learning algorithm is not applied to the whole training data but only to the k nearest neighbors (kNN) of the testing point. In case of support vector machines, the success of such mixtures of SVM and kNN (called SVM-KNN) has been shown in extensive simulation studies and also for real data sets but only little has been known on theoretical properties so far. In the present article, it is shown how a large class of regularized kernel methods (including SVM) can be localized in order to get a universally consistent learning algorithm.
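
The localization step described in the abstract (apply the global learner only to the k nearest neighbors of the testing point) can be sketched as follows. This is an illustrative skeleton under stated assumptions: a nearest-centroid classifier stands in for the SVM as the global learner, Euclidean distance is assumed for the neighbor search, and all function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points given as equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_nearest_centroid(data):
    """Stand-in global learner: returns a classifier that picks the closest class centroid."""
    sums, counts = {}, {}
    for features, label in data:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {
        label: [s / counts[label] for s in acc] for label, acc in sums.items()
    }
    return lambda x: min(centroids, key=lambda label: euclidean(centroids[label], x))


def localized_predict(train, x, k, fit_global):
    """Fit the global learner on the k nearest neighbors of x only, then predict at x."""
    neighbors = sorted(train, key=lambda pair: euclidean(pair[0], x))[:k]
    model = fit_global(neighbors)
    return model(x)


# Hypothetical toy data: two well-separated classes in the plane.
train = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
    ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"),
]
print(localized_predict(train, (0.2, 0.2), k=3, fit_global=fit_nearest_centroid))
print(localized_predict(train, (5.5, 5.5), k=3, fit_global=fit_nearest_centroid))
```

Because `fit_global` is passed in as a parameter, any learner that maps a training set to a prediction function could be substituted, which mirrors the claim that every global algorithm can be localized this way.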

- TC Grafenau

Journal article

- Robert Hable

Asymptotic Normality of Support Vector Machine Variants and Other Regularized Kernel Methods

In: Journal of Multivariate Analysis vol. 106 pg. 92-117.

DOI: 10.1016/j.jmva.2011.11.004

In nonparametric classification and regression problems, regularized kernel methods, in particular support vector machines, attract much attention in theoretical and in applied statistics. In an abstract sense, regularized kernel methods (simply called SVMs here) can be seen as regularized M-estimators for a parameter in a (typically infinite dimensional) reproducing kernel Hilbert space. For smooth loss functions $L$, it is shown that the difference between the estimator, i.e. the empirical SVM $f_{L,D_n,\lambda_{D_n}}$, and the theoretical SVM $f_{L,P,\lambda_0}$ is asymptotically normal with rate $\sqrt{n}$. That is, $\sqrt{n}\,(f_{L,D_n,\lambda_{D_n}} - f_{L,P,\lambda_0})$ converges weakly to a Gaussian process in the reproducing kernel Hilbert space. As common in real applications, the choice of the regularization parameter $\lambda_{D_n}$ in $f_{L,D_n,\lambda_{D_n}}$ may depend on the data. The proof is done by an application of the functional delta-method and by showing that the SVM-functional $P \mapsto f_{L,P,\lambda}$ is suitably Hadamard-differentiable.

- TC Grafenau

Journal article

- A. Christmann
- Robert Hable

Consistency of support vector machines using additive kernels for additive models

In: Computational Statistics & Data Analysis vol. 56 pg. 854-873.

DOI: 10.1016/j.csda.2011.04.006

Support vector machines (SVMs) are special kernel based methods and have been among the most successful learning methods for more than a decade. SVMs can informally be described as kinds of regularized M-estimators for functions and have demonstrated their usefulness in many complicated real-life problems. During the last few years a great part of the statistical research on SVMs has concentrated on the question of how to design SVMs such that they are universally consistent and statistically robust for nonparametric classification or nonparametric regression purposes. In many applications, some qualitative prior knowledge of the distribution $P$ or of the unknown function $f$ to be estimated is present or a prediction function with good interpretability is desired, such that a semiparametric model or an additive model is of interest. The question of how to design SVMs by choosing the reproducing kernel Hilbert space (RKHS) or its corresponding kernel to obtain consistent and statistically robust estimators in additive models is addressed. An explicit construction of such RKHSs and their kernels, which will be called additive kernels, is given. SVMs based on additive kernels will be called additive support vector machines. The use of such additive kernels leads, in combination with a Lipschitz continuous loss function, to SVMs with the desired properties for additive models. Examples include quantile regression based on the pinball loss function, regression based on the $\epsilon$-insensitive loss function, and classification based on the hinge loss function.

- TC Grafenau

Contribution to edited volume/conference proceedings

- M. Troffaes
- Robert Hable

Robustness of Natural Extension

- (2011)

- TC Grafenau

Journal article

- Robert Hable
- A. Christmann

On Qualitative Robustness of Support Vector Machines

In: Journal of Multivariate Analysis vol. 102 pg. 993-1007.

DOI: 10.1016/j.jmva.2011.01.009

Support vector machines (SVMs) have attracted much attention in theoretical and in applied statistics. The main topics of recent interest are consistency, learning rates and robustness. We address the open problem whether SVMs are qualitatively robust. Our results show that SVMs are qualitatively robust for any fixed regularization parameter $\lambda$. However, under extremely mild conditions on the SVM, it turns out that SVMs are not qualitatively robust any more for any null sequence $\lambda_n$, which are the classical sequences needed to obtain universal consistency. This lack of qualitative robustness is of a rather theoretical nature because we show that, in any case, SVMs fulfill a finite sample qualitative robustness property.
For a fixed regularization parameter, SVMs can be represented by a functional on the set of all probability measures. Qualitative robustness is proven by showing that this functional is continuous with respect to the topology generated by weak convergence of probability measures. Combined with the existence and uniqueness of SVMs, our results show that SVMs are the solutions of a well-posed mathematical problem in Hadamard’s sense.

- TC Grafenau

Contribution to edited volume/conference proceedings

- T. Kroupa
- Robert Hable

Structure of the Set of Belief Functions Generated by a Random Closed Interval

- (2010)

- TC Grafenau

Journal article

- Robert Hable

A Minimum Distance Estimator in an Imprecise Probability Model - Computational Aspects and Applications

In: International Journal of Approximate Reasoning vol. 51 pg. 1114-1128.

DOI: 10.1016/j.ijar.2010.08.003

The article considers estimating a parameter $\theta$ in an imprecise probability model $(\overline{P}_\theta)_{\theta\in\Theta}$ which consists of coherent upper previsions $\overline{P}_\theta$. After the definition of a minimum distance estimator in this setup and a summarization of its main properties, the focus lies on applications. It is shown that approximate minimum distances on the discretized sample space can be calculated by linear programming. After a discussion of some computational aspects, the estimator is applied in a simulation study consisting of two different models. Finally, the estimator is applied on a real data set in a linear regression model.

- TC Grafenau

Journal article

- T. Augustin
- Robert Hable

On the impact of robust statistics on imprecise probability models: a review

In: Structural Safety vol. 32 pg. 358-365.

DOI: 10.1016/j.strusafe.2010.06.002

Robust statistics is concerned with statistical methods that still lead to reliable conclusions if an ideal model is only approximately true. More recently, the theory of imprecise probabilities was developed as a general methodology to model non-stochastic uncertainty (ambiguity) adequately, and has been successfully applied to many engineering problems. In robust statistics, small deviations from ideal models are modeled by certain neighborhoods. Since nearly all commonly used neighborhoods are imprecise probabilities, a large part of robust statistics can be seen as a special case of imprecise probabilities. Therefore, it seems quite promising to address problems in the theory of imprecise probabilities by trying to generalize results of robust statistics. In this review paper, we present some cases where this has already been done successfully and where the connections between (frequentist) robust statistics and imprecise probabilities are most striking.

- TC Grafenau

Journal article

- Robert Hable

Minimum Distance Estimation in Imprecise Probability Models

In: Journal of Statistical Planning and Inference vol. 140 pg. 461-479.

The present article considers estimating a parameter $\theta$ in an imprecise probability model $(\overline{P}_\theta)_{\theta \in \Theta}$. This model consists of coherent upper previsions $\overline{P}_\theta$ which are given by finite numbers of constraints on expectations. A minimum distance estimator is defined in this case and its asymptotic properties are investigated. It is shown that the minimum distance can be approximately calculated by discretizing the sample space. Finally, the estimator is applied in a simulation study and on a real data set.

- TC Grafenau

Journal article

- Robert Hable
- P. Ruckdeschel
- H. Rieder

Optimal robust influence functions in semiparametric regression

In: Journal of Statistical Planning and Inference vol. 140 pg. 226-245.

DOI: 10.1016/j.jspi.2009.07.010

Robust statistics allows the distribution of the observations to be any member of a suitable neighborhood about an ideal model distribution. In this paper, the ideal models are semiparametric with finite-dimensional parameter of interest and a possibly infinite-dimensional nuisance parameter.
In the asymptotic setup of shrinking neighborhoods, we derive and study the Hampel-type problem and the minmax MSE-problem. We show that, for all common types of neighborhood systems, the optimal influence function $\tilde{\psi}$ can be approximated by the optimal influence functions $\tilde{\psi}_n$ for certain parametric models.
For general semiparametric regression models, we determine $(\tilde{\psi}_n)_{n \in \mathbb{N}}$ in case of error-in-variables and in case of error-free-variables.
Finally, the results are applied to Cox regression where we compare our approach to that of Bednarski [1993. Robust estimation in Cox's regression model. Scand. J. Statist. 20, 213–225] in a small simulation study and on a real data set.

- TC Grafenau

Contribution to edited volume/conference proceedings

- T. Augustin
- Robert Hable

On the impact of robust statistics on imprecise probability models: a review

- (2009)

- TC Grafenau

Journal article

- Robert Hable

Data-Based Decisions under Imprecise Probability and Least Favorable Models.

In: International Journal of Approximate Reasoning vol. 50 pg. 642-654.

DOI: 10.1016/j.ijar.2008.03.009

Data-based decision theory under imprecise probability has to deal with optimization problems where direct solutions are often computationally intractable. Using the $\Gamma$-minimax optimality criterion, the computational effort may significantly be reduced in the presence of a least favorable model. Buja [A. Buja, Simultaneously least favorable experiments. I. Upper standard functionals and sufficiency, Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 65 (1984) 367–384] derived a necessary and sufficient condition for the existence of a least favorable model in a special case. The present article proves that essentially the same result is valid in case of general coherent upper previsions. This is done mainly by topological arguments in combination with some of Le Cam’s decision theoretic concepts. It is shown how least favorable models could be used to deal with situations where the distribution of the data as well as the prior is allowed to be imprecise.

- TC Grafenau

Contribution to edited volume/conference proceedings

- Robert Hable

A Minimum Distance Estimator in an Imprecise Probability Model - Computational Aspects and Applications

- (2009)

- TC Grafenau

Contribution to edited volume/conference proceedings

- D. Skulj
- Robert Hable

Coefficients of ergodicity for imprecise Markov chains

- (2009)

Coefficients of ergodicity are an important tool in measuring convergence of Markov chains. We explore possibilities to generalise the concept to imprecise Markov chains. We find that this can be done in at least two different ways, which both have interesting implications in the study of convergence of imprecise Markov chains. Thus we extend the existing definition of the uniform coefficient of ergodicity and define a new so-called weak coefficient of ergodicity. The definition is based on the endowment of a structure of a metric space to the class of imprecise probabilities. We show that this is possible to do in some different ways, which turn out to coincide.

- TC Grafenau

Journal article

- Robert Hable

Finite approximations of data-based decision problems under imprecise probabilities.

In: International Journal of Approximate Reasoning vol. 50 pg. 1115-1128.

DOI: 10.1016/j.ijar.2009.05.003

In decision theory under imprecise probabilities, discretizations are a crucial topic because many applications involve infinite sets whereas most procedures in the theory of imprecise probabilities can only be calculated for finite sets so far. The present paper develops a method for discretizing sample spaces in data-based decision theory under imprecise probabilities. The proposed method turns an original decision problem into a discretized decision problem. It is shown that any solution of the discretized decision problem approximately solves the original problem.
In doing so, it is pointed out that the commonly used method of natural extension can be most instable. A way to avoid this instability is presented which is sufficient for the purpose of the paper.

- TC Grafenau

Contribution to edited volume/conference proceedings

- Robert Hable

Data-Based Decisions under Imprecise Probability and Least Favorable Models. And Supplements to the Article

pg. 203-212.

- (2007)

- TC Grafenau