Master's Oral
Monday 14 February at 15h00
A409, General Engineering Building
In audio-visual automatic speech recognition (AVASR) both the acoustic and visual modalities of speech are used for speech recognition purposes. The use of the visual speech modality is motivated by the ability of hearing-impaired listeners to understand speech from visual cues only, through so-called lip-reading. In particular, AVASR is expected to perform better than traditional audio-only speech recognition systems, as the visual channel is not affected by acoustic noise.
The components comprising an AVASR system are acoustic and visual feature extraction, feature stream weighting, feature stream integration, model learning, and classification. In this study we focus on the feature stream weighting, feature stream integration, model learning, and classification problems.
Probability theory, with its inherent notions of uncertainty and confidence, is a natural approach to solving these problems. We have chosen to focus on the specific class of probabilistic models that can be formulated as dynamic Bayesian networks (DBNs). DBNs are an extension of Bayesian networks that allows for modelling variable-length sequences of observed and unobserved random variables, such as sequences of features extracted from speech samples. DBNs also allow for integrating stream weighting and stream integration techniques at the modelling level.
Our experimental results show that there is indeed information in the visual speech modality that is useful for speech recognition. We further find that models that combine acoustic and visual speech features in general perform better than models that use only acoustic features. We find that stream weighting based on measuring the reliability of the respective feature streams is beneficial to AVASR. We also consider a feature stream integration scheme that allows for asynchrony between the streams while still modelling the natural correlation between the acoustic and visual speech modalities.
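One common way to weight feature streams is to combine per-stream log-likelihoods linearly, giving the more reliable stream the larger weight. The sketch below is illustrative only: the function names, the weight `lam` and the example scores are assumptions introduced here, not the DBN models evaluated in the thesis.

```python
def fuse_log_likelihoods(audio_ll, visual_ll, lam):
    """Combine per-class log-likelihoods from two streams.

    lam in [0, 1] is a hypothetical audio stream weight; the visual
    stream receives 1 - lam.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return {
        label: lam * audio_ll[label] + (1.0 - lam) * visual_ll[label]
        for label in audio_ll
    }


def classify(audio_ll, visual_ll, lam):
    """Pick the class with the highest fused log-likelihood."""
    fused = fuse_log_likelihoods(audio_ll, visual_ll, lam)
    return max(fused, key=fused.get)


# Made-up scores: the audio stream slightly favours "bat",
# the visual stream strongly favours "pat".
audio = {"bat": -4.0, "pat": -5.0}
visual = {"bat": -9.0, "pat": -3.0}

print(classify(audio, visual, lam=0.9))  # audio trusted
print(classify(audio, visual, lam=0.3))  # audio judged unreliable
```

Lowering `lam` when the acoustic channel is noisy shifts the decision towards the visual evidence, which is the intuition behind reliability-based stream weighting.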
Master's Oral
Tuesday 1 February at 16h00
A409, General Engineering Building
The domination parameters of a graph can behave in three ways when a set of k vertices is removed: they can increase, decrease or stay unchanged. The purpose of this thesis is to investigate when the domination parameters of a graph decrease with the removal of any set of k vertices.
The case where only one vertex is removed is well documented in the literature, but there still remain a few open questions regarding the criticality of a graph with respect to the lower irredundance number. One of these questions asks whether there exists a graph that is critical with respect to the domination number, but not critical with respect to the lower irredundance number, or whether there exists a graph that is critical with respect to the lower irredundance number, but not critical with respect to the domination number. The existence of two classes of graphs with this property will be demonstrated by using the structure of coalescence.
The behaviour of the domination parameters of a graph when more than one vertex is removed was previously only investigated for the domination number of a graph. This concept is expanded to the lower independence number and the upper domination parameters.
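For small graphs, the question of whether the domination number decreases when any set of k vertices is removed can be checked by exhaustive search. The following is a minimal sketch for the domination number only; the function names and the adjacency-dictionary representation are assumptions made for illustration, and the brute-force search is practical only for very small graphs.

```python
from itertools import combinations


def domination_number(adj):
    """Size of a smallest dominating set, found by exhaustive search."""
    vertices = list(adj)
    for size in range(len(vertices) + 1):
        for candidate in combinations(vertices, size):
            dominated = set(candidate)
            for v in candidate:
                dominated |= adj[v]
            if len(dominated) == len(vertices):
                return size


def remove_vertices(adj, removed):
    """Return the subgraph induced by the vertices not in `removed`."""
    removed = set(removed)
    return {v: nbrs - removed for v, nbrs in adj.items() if v not in removed}


def decreases_for_every_k_set(adj, k):
    """True if removing any set of k vertices lowers the domination number."""
    gamma = domination_number(adj)
    return all(
        domination_number(remove_vertices(adj, s)) < gamma
        for s in combinations(adj, k)
    )


def cycle(n):
    """Adjacency dictionary of the cycle on n >= 3 vertices."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


print(domination_number(cycle(4)))
print(decreases_for_every_k_set(cycle(4), 1))
print(decreases_for_every_k_set(cycle(5), 1))
```

The 4-cycle has domination number 2, and removing any single vertex leaves a path on three vertices with domination number 1. Removing a vertex from the 5-cycle leaves a path on four vertices, whose domination number is still 2.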
Master's Oral
Monday 24 January at 16h00
A409, General Engineering Building
An off-line signature verification system attempts to authenticate the identity of an individual by examining his/her handwritten signature, after it has been successfully extracted from, for example, a cheque, a debit or credit card transaction slip, or any other legal document. The questioned signature is typically compared to a model trained from known positive samples, after which the system attempts to label said signature as genuine or fraudulent. In this presentation a novel off-line signature verification system, using a multi-hypothesis approach and classifier fusion, is proposed. Each base classifier is constructed from a hidden Markov model (HMM) that is trained from features extracted from local regions of the signature (local features), as well as from the signature as a whole (global features). To achieve this, each signature is zoned into a number of overlapping circular retinas, from which said features are extracted by implementing the discrete Radon transform. A global retina, that encompasses the entire signature, is also considered.
Signatures obtained from so-called "guinea-pig" writers (for example, bank employees) constitute a convenient optimisation set that is used to select the most proficient ensemble of base classifiers. A signature that is claimed to belong to a legitimate client (a member of the general public) is therefore rejected or accepted based on the majority vote decision of the base classifiers within the most proficient ensemble.
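The majority vote decision can be sketched as follows. This is an illustration only: the function name is hypothetical, each base classifier is reduced to a boolean accept/reject decision, and rejecting on a tie is an assumption that the abstract does not specify.

```python
def majority_vote(decisions):
    """Accept (True) when more than half of the base classifiers accept.

    `decisions` is a sequence of booleans, one per base classifier.
    A tie is treated as a rejection (an assumption made here).
    """
    accepts = sum(1 for d in decisions if d)
    return accepts * 2 > len(decisions)


# Three hypothetical base classifiers: two accept, one rejects.
print(majority_vote([True, True, False]))
```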
When evaluated on a data set containing high-quality imitations, the inclusion of local features, together with classifier combination, significantly increases system performance. An equal error rate of 8.6% is achieved, which compares favorably to an achieved equal error rate of 12.9% (an improvement of 33.3%) when only global features are considered.
Master's Oral
Tuesday 16 November at 16h00
A409, General Engineering Building
Extracting data from video streams, and using that data to better understand the observed world, allows many systems to perform automatically tasks that ordinarily had to be completed by humans. One such problem, with a wide range of applications, is that of detecting and tracking people in a video sequence. This thesis looks specifically at the problem of estimating the positions of players on a sports field, as observed by a multi-view camera setup. Previous attempts at solving the problem are discussed, after which the problem is broken down into three stages: detection, 2D tracking and 3D position estimation. Possible solutions to each stage are discussed and compared to one another. Motion detection is found to be a fast and effective solution to the problem of detecting players in a single view. Tracking players in 2D image coordinates is performed by implementing a hierarchical approach to the particle filter, chosen because it reduces the computational cost without compromising accuracy. Finally, 3D position estimation is done by multi-view, forward-projection triangulation. The components are combined to form a full system that is able to detect, track and triangulate player positions on a sports field. The components are tested individually and found to perform well. By combining the components and introducing feedback between them, the results of the individual components, as well as those of the overall system, are improved.
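The detection stage can be illustrated with simple frame differencing, one basic form of motion detection. The abstract does not say which motion detection method the thesis uses, so the approach, the function names and the threshold below are all assumptions made for this sketch. Frames are plain lists of grey-level rows.

```python
def detect_motion(previous, current, threshold=25):
    """Binary mask of pixels whose grey level changed by more than `threshold`."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


def bounding_box(mask):
    """Smallest (row_min, col_min, row_max, col_max) box around moving pixels."""
    hits = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return min(rows), min(cols), max(rows), max(cols)


# Two made-up 4 x 4 frames: a bright two-pixel "player" appears in column 2.
frame_a = [[0] * 4 for _ in range(4)]
frame_b = [[0] * 4 for _ in range(4)]
frame_b[1][2] = 200
frame_b[2][2] = 200

mask = detect_motion(frame_a, frame_b)
print(bounding_box(mask))
```

In a full system the resulting box would seed the 2D tracker, which is where the feedback between components mentioned above could enter.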
PhD Defence
Tuesday 19 October at 16h00
A409, General Engineering Building
Super-resolution imaging is the process whereby several low-resolution photographs of an object are combined to form a single high-resolution estimation. We investigate each component of this process: image acquisition, registration and reconstruction.
A new method, based on the discrete pulse transform, is developed to detect important image features. We compute and store the transform efficiently, and use the resulting features to obtain an accurate alignment between input images.
To simplify reconstruction, the imaging model is linearised, whereafter a polygon-based interpolation operator is introduced to model the underlying camera sensor. Finally, a large, sparse, over-determined system of linear equations is solved, using regularisation.
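As an illustration only, and assuming Tikhonov regularisation (the abstract does not name the regulariser), a linearised imaging model $A x \approx b$ leads to the problem below. Here $x$ is the unknown high-resolution image, $b$ stacks the pixels of the registered low-resolution images, $A$ is the sparse imaging operator and $\lambda > 0$ is a regularisation weight; all of these symbols are introduced for this sketch and do not come from the thesis.

```latex
\hat{x} = \arg\min_{x} \; \|A x - b\|_2^2 + \lambda \|x\|_2^2
\quad\Longleftrightarrow\quad
(A^{\top} A + \lambda I)\,\hat{x} = A^{\top} b
```

Because $A$ is large and sparse, such a system would typically be solved with an iterative method rather than by forming $A^{\top} A$ explicitly.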
The software developed to perform these computations is made available under an open source license, and may be used to verify the results shown.
PhD Defence
Tuesday 19 October at 16h00
A409, General Engineering Building
Super-resolution imaging combines several low-resolution photographs of a subject into a single, high-resolution estimate. We investigate each step of this process: image formation, image alignment and high-resolution composition.
A new method, which relies on the discrete pulse transform, is developed to find important image features. The pulse transform is computed efficiently and stored compactly, after which an accurate alignment between input images is obtained based on the features.
With a view to simplified reconstruction, we propose a linear imaging model in which the camera sensor is modelled in terms of polygon interpolation. Finally, we solve a large, sparse, over-determined system of linear equations with the aid of regularisation.
The software developed for these computations is available subject to an open-source licence, and may be freely used to verify the given results.
Master's Oral
Wednesday 12 May at 14h00
A409, General Engineering Building
In this dissertation we study the machine learning subfield of Reinforcement Learning (RL). After developing a coherent background, we apply a Monte Carlo control algorithm with exploring starts (MCES), as well as an off-policy Temporal-Difference (TD) learning control algorithm, Q-learning, to a simplified version of the Weapon Assignment (WA) problem.
For the MCES control algorithm, a discount parameter of γ = 1 is used. This gives very promising results when applied to 7 × 7 grids, as well as 71 × 71 grids. The same discount parameter cannot be applied to the Q-learning algorithm, as it causes the Q-values to diverge. We take a greedy approach, setting ε = 0, and vary the learning rate (α) and the discount parameter (γ). Experimentation shows that the best results are found with α set to 0.1 and γ constrained to the region 0.4 ≤ γ ≤ 0.7.
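For reference, the standard tabular Q-learning update with a greedy policy can be sketched as follows. The state and action names are hypothetical placeholders and have nothing to do with the actual Weapon Assignment formulation. The default α = 0.1 matches the value reported above, and γ = 0.5 is one arbitrary choice from the reported range.

```python
def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.5):
    """Apply one Q-learning update and return the new Q(state, action).

    q maps (state, action) pairs to values; unseen pairs default to 0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def greedy_action(q, state, actions):
    """Greedy action selection, i.e. epsilon = 0."""
    return max(actions, key=lambda a: q.get((state, a), 0.0))


# Hypothetical two-action example.
actions = ["fire", "hold"]
q_table = {("s1", "fire"): 2.0}
new_value = q_update(q_table, "s0", "fire", reward=1.0, next_state="s1", actions=actions)
print(new_value, greedy_action(q_table, "s0", actions))
```

The update bootstraps from the best next-state value regardless of the action actually taken next, which is what makes Q-learning an off-policy method.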
The MC control algorithm with exploring starts gives promising results when applied to the WA problem. It performs significantly better than the off-policy TD algorithm, Q-learning, even though it is almost twice as slow.
The modern battlefield is a fast paced, information rich environment, where discovery of intent, situation awareness and the rapid evolution of concepts of operation and doctrine are critical success factors. Combining the techniques investigated and tested in this work with other techniques in Artificial Intelligence (AI) and modern computational techniques may hold the key to solving some of the problems we now face in warfare.
Enquiries: Prof. B.M. Herbst (Tel: 021 808 4217)