Publication list generated by bibbase.org from https://raw.githubusercontent.com/Roznn/Roznn.github.io/master/works.bib

2024 (2)

A real-time Digital Twin for active safety in an aircraft hangar.
Casey, L.; Dooley, J.; Codd, M.; Dahyot, R.; Cognetti, M.; Mullarkey, T.; Redmond, P.; and Lacey, G.
Frontiers in Virtual Reality, 5. 2024.

The aerospace industry prioritises safety protocols to prevent accidents that can result in injuries, fatalities, or aircraft damage. One of the potential hazards that can occur while manoeuvring aircraft in and out of a hangar is collisions with other aircraft or buildings, which can lead to operational disruption and costly repairs. To tackle this issue, we have developed the Smart Hangar project, which aims to alert personnel of increased risks and prevent incidents from happening. The Smart Hangar project uses computer vision, LiDAR, and ultra-wideband sensors to track all objects and individuals within the hangar space. These data inputs are combined to form a real-time 3D Digital Twin (DT) of the hangar environment. The Active Safety system then uses the DT to perform real-time path planning, collision prediction, and safety alerts for tow truck drivers and hangar personnel. This paper provides a detailed overview of the system architecture, including the technologies used, and highlights the system's performance. By implementing this system, we aim to reduce the risk of accidents in the aerospace industry and increase safety for all personnel involved. Additionally, we identify future research directions for the Smart Hangar project.

@article{SmartHangar2024,
  author  = {Luke Casey and John Dooley and Michael Codd and Rozenn Dahyot and Marco Cognetti and Thomas Mullarkey and Peter Redmond and Gerard Lacey},
  title   = {A real-time Digital Twin for active safety in an aircraft hangar},
  journal = {Frontiers in Virtual Reality},
  volume  = {5},
  year    = {2024},
  url     = {https://www.frontiersin.org/articles/10.3389/frvir.2024.1372923},
  doi     = {10.3389/frvir.2024.1372923}
}

Roznn/Bootstrap: Latex sources for beamer slides on resampling techniques.
Dahyot, R.
Zenodo (software, version v1.0), Feb 2024.

Latex, figures and PDF lecture notes used for teaching Bootstrap, Jackknife and other resampling methods.

@software{rozenn_dahyot_2024_10636835,
  author    = {Rozenn Dahyot},
  title     = {{Roznn/Bootstrap: Latex sources for beamer slides on resampling techniques}},
  month     = {Feb},
  year      = {2024},
  publisher = {Zenodo},
  version   = {v1.0},
  doi       = {10.5281/zenodo.10636835},
  url       = {https://github.com/Roznn/Bootstrap/}
}

2023 (6)

CONTEXT AWARE OBJECT GEOTAGGING.
Liu, C.; Ulicny, M.; and Dahyot, R.
United States patent application 18087227, June 2023.

@patent{Patent2023,
  author      = {Chao-jung Liu and Matej Ulicny and Rozenn Dahyot},
  title       = {CONTEXT AWARE OBJECT GEOTAGGING},
  nationality = {United States},
  number      = {18087227},
  day         = {29},
  month       = {June},
  year        = {2023},
  url         = {https://patents.google.com/patent/US20230206402A1/en}
}

Improving GMM registration with class encoding.
Panahi, S.; Chopin, J.; Ulicny, M.; and Dahyot, R.
In Irish Machine Vision and Image Processing (IMVIP 2023), Galway, Ireland, 2023. https://github.com/solmak97/GMMReg_Extension

Point set registration is critical in many applications such as computer vision, pattern recognition, or in fields like robotics and medical imaging. This paper focuses on reformulating point set registration using Gaussian Mixture Models while considering attributes associated with each point. Our approach introduces class score vectors as additional features to the spatial data information. By incorporating these attributes, we enhance the optimization process by penalizing incorrect matching terms. Experimental results show that our approach with class scores outperforms the original algorithm in both accuracy and speed.

@inproceedings{Panahi2023,
  author    = {Solmaz Panahi and Jeremy Chopin and Matej Ulicny and Rozenn Dahyot},
  title     = {Improving GMM registration with class encoding},
  booktitle = {Irish Machine Vision and Image Processing (IMVIP 2023)},
  address   = {Galway, Ireland},
  year      = {2023},
  url       = {https://zenodo.org/records/8205096/files/Improving_GMM_registration_with_class_encodings.pdf},
  doi       = {10.5281/zenodo.8205096},
  keywords  = {registration, GMM, class encoding},
  note      = {https://github.com/solmak97/GMMReg_Extension}
}

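The abstract above describes appending class score vectors to the spatial coordinates of each point before GMM-based registration, so that geometrically close but semantically different points are penalised. Below is a minimal illustrative sketch of that idea in Python; it is not the code from the GMMReg_Extension repository, and the helper names, weight, and toy points/scores are hypothetical.

import numpy as np

def augment(points, class_scores, weight=1.0):
    # concatenate spatial coordinates with (weighted) class score vectors
    return np.hstack([points, weight * class_scores])

def gmm_overlap(src, dst, sigma=0.1):
    # sum of Gaussian kernels between all pairs of augmented points:
    # the overlap a registration would maximise over candidate transformations
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2)).sum()

# toy example: two 2-D points per set, each carrying 2-class one-hot scores
src_xy  = np.array([[0.0, 0.0], [1.0, 1.0]])
dst_xy  = np.array([[0.05, 0.0], [1.0, 0.95]])
src_cls = np.array([[1.0, 0.0], [0.0, 1.0]])
dst_cls = np.array([[1.0, 0.0], [0.0, 1.0]])

with_cls    = gmm_overlap(augment(src_xy, src_cls), augment(dst_xy, dst_cls))
without_cls = gmm_overlap(src_xy, dst_xy)
print(with_cls, without_cls)   # class-aware overlap only rewards pairs that agree in geometry AND class
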
Query Based Acoustic Summarization for Podcasts.
Kotey, S.; Dahyot, R.; and Harte, N.
In Proceedings INTERSPEECH 2023, pages 1483-1487, Dublin, Ireland, August 2023.

Podcasts are a rich storytelling medium of long diverse conversations. Typically, listeners preview an episode through an audio clip, before deciding to consume the content. An automatic system that produces promotional clips, by supporting acoustic queries would greatly benefit podcasters. Previous text based methods do not use the acoustic signal directly or incorporate acoustic defined queries. Therefore, we propose a query based summarization approach, to produce audio clip summaries from podcast data. Leveraging unsupervised clustering methods, we apply our framework to the Spotify podcasts dataset. Audio signals are transformed into acoustic word embeddings, along with a pre-selected candidate query. We initiate the cluster centroids with the query vector and obtain the final snippets by computing a global and local similarity score. Additionally, we apply our framework to the AMI meeting dataset and demonstrate how audio can successfully be utilized to perform summarization.

@inproceedings{KoteyInterSpeech2023,
  author    = {Samantha Kotey and Rozenn Dahyot and Naomi Harte},
  title     = {Query Based Acoustic Summarization for Podcasts},
  booktitle = {Proceedings INTERSPEECH 2023},
  year      = {2023},
  pages     = {1483--1487},
  keywords  = {query-based summarization, unsupervised speech summarization, clustering, acoustic word embeddings},
  doi       = {10.21437/Interspeech.2023-864},
  url       = {https://www.isca-archive.org/interspeech_2023/kotey23_interspeech.pdf},
  address   = {Dublin, Ireland},
  month     = {August}
}

Combining geolocation and height estimation of objects from street level imagery.
Ulicny, M.; Krylov, V. A.; Connelly, J.; and Dahyot, R.
Technical Report, 2023.

We propose a pipeline for combined multi-class object geolocation and height estimation from street level RGB imagery, which is considered as a single available input data modality. Our solution is formulated via Markov Random Field optimization with deterministic output. The proposed technique uses image metadata along with coordinates of objects detected in the image plane as found by a custom-trained Convolutional Neural Network. Computing the object height using our methodology, in addition to object geolocation, has negligible effect on the overall computational cost. Accuracy is demonstrated experimentally for water drains and road signs on which we achieve average elevation estimation error lower than 20cm.

@techreport{ulicny2023combining,
  author        = {Matej Ulicny and Vladimir A. Krylov and Julie Connelly and Rozenn Dahyot},
  title         = {Combining geolocation and height estimation of objects from street level imagery},
  year          = {2023},
  doi           = {10.48550/arXiv.2305.08232},
  url           = {https://arxiv.org/pdf/2305.08232.pdf},
  eprint        = {2305.08232},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV}
}

Model-based inexact graph matching on top of DNNs for semantic scene understanding.
Chopin, J.; Fasquel, J.; Mouchère, H.; Dahyot, R.; and Bloch, I.
Computer Vision and Image Understanding, 103744, 2023.

Deep learning based pipelines for semantic segmentation often ignore structural information available on annotated images used for training. We propose a novel post-processing module enforcing structural knowledge about the objects of interest to improve segmentation results provided by deep neural networks (DNNs). This module corresponds to a “many-to-one-or-none” inexact graph matching approach, and is formulated as a quadratic assignment problem. Our approach is compared to a DNN-based segmentation on two public datasets, one for face segmentation from 2D RGB images (FASSEG), and the other for brain segmentation from 3D MRIs (IBSR). Evaluations are performed using two types of structural information: distances and directional relations that are user defined, this choice being a hyper-parameter of our proposed generic framework. On FASSEG data, results show that our module improves accuracy of the DNN by about 6.3%, i.e. the Hausdorff distance (HD) decreases from 22.11 to 20.71 on average. With IBSR data, the improvement is of 51% better accuracy with HD decreasing from 11.01 to 5.4. Finally, our approach is shown to be resilient to small training datasets that often limit the performance of deep learning methods: the improvement increases as the size of the training dataset decreases.

@article{CHOPIN2023103744,
  title    = {Model-based inexact graph matching on top of DNNs for semantic scene understanding},
  journal  = {Computer Vision and Image Understanding},
  pages    = {103744},
  year     = {2023},
  issn     = {1077-3142},
  doi      = {10.1016/j.cviu.2023.103744},
  url      = {https://arxiv.org/pdf/2301.07468.pdf},
  author   = {Jeremy Chopin and Jean-Baptiste Fasquel and Harold Mouchère and Rozenn Dahyot and Isabelle Bloch},
  keywords = {Graph matching, Deep learning, Image segmentation, Volume segmentation, Quadratic assignment problem}
}

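The abstract above casts "many-to-one-or-none" inexact graph matching as a quadratic assignment problem (QAP). As a toy illustration of what such a formulation scores (not the paper's solver or cost), the sketch below brute-forces assignments of three region nodes to two model nodes or to none, combining node affinities with pairwise relation affinities; all affinity values are hypothetical.

import itertools
import numpy as np

rng = np.random.default_rng(0)
# hypothetical node affinities: 3 segmented regions x 2 model nodes
U = np.array([[0.9, 0.1],
              [0.2, 0.8],
              [0.6, 0.5]])
# P[i, j, a, b]: compatibility of the relation (region i, region j)
# with the relation (model node a, model node b); hypothetical values
P = rng.random((3, 3, 2, 2))

best_score, best_assign = -np.inf, None
for assign in itertools.product([None, 0, 1], repeat=3):      # "or-none" assignments
    score = sum(U[i, a] for i, a in enumerate(assign) if a is not None)
    score += sum(P[i, j, a, b]
                 for (i, a), (j, b) in itertools.permutations(enumerate(assign), 2)
                 if a is not None and b is not None)
    if score > best_score:
        best_score, best_assign = score, assign
print(best_assign, round(best_score, 3))
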
Fine Grained Spoken Document Summarization Through Text Segmentation.
Kotey, S.; Dahyot, R.; and Harte, N.
In 2022 IEEE Spoken Language Technology Workshop (SLT), pages 647-654, Jan 2023.

Podcast transcripts are long spoken documents of conversational dialogue. Challenging to summarize, podcasts cover a diverse range of topics, vary in length, and have uniquely different linguistic styles. Previous studies in podcast summarization have generated short, concise dialogue summaries. In contrast, we propose a method to generate long fine-grained summaries, which describe details of sub-topic narratives. Leveraging a readability formula, we curate a data subset to train a long sequence transformer for abstractive summarization. Through text segmentation, we filter the evaluation data and exclude specific segments of text. We apply the model to segmented data, producing different types of fine grained summaries. We show that appropriate filtering creates comparable results on ROUGE and serves as an alternative method to truncation. Experiments show our model outperforms previous studies on the Spotify podcast dataset when tasked with generating longer sequences of text.

@inproceedings{KoteySLT2023,
  author    = {Kotey, Samantha and Dahyot, Rozenn and Harte, Naomi},
  booktitle = {2022 IEEE Spoken Language Technology Workshop (SLT)},
  title     = {Fine Grained Spoken Document Summarization Through Text Segmentation},
  year      = {2023},
  pages     = {647-654},
  doi       = {10.1109/SLT54892.2023.10022829},
  url       = {https://roznn.github.io/PDF/STL2022Kotey.pdf},
  month     = {Jan}
}

2022 (5)

Principal Component Classification.
Dahyot, R.
Technical Report, 2022.

We propose to directly compute classification estimates by learning features encoded with their class scores. Our resulting model has an encoder-decoder structure suitable for supervised learning; it is computationally efficient and performs well for classification on several datasets.

@techreport{Dahyot_PCC2022,
  author    = {Rozenn Dahyot},
  title     = {Principal Component Classification},
  keywords  = {Supervised Learning, PCA, classification, metric learning, deep learning, class encoding},
  publisher = {arXiv},
  year      = {2022},
  doi       = {10.48550/ARXIV.2210.12746},
  url       = {https://arxiv.org/pdf/2210.12746.pdf},
  copyright = {Creative Commons Attribution 4.0 International}
}

Harmonic Convolutional Networks based on Discrete Cosine Transform.
Ulicny, M.; Krylov, V. A.; and Dahyot, R.
Pattern Recognition, 129: 1-12. 2022. arXiv:2001.06570; Github: https://github.com/matej-ulicny/harmonic-networks

Convolutional neural networks (CNNs) learn filters in order to capture local correlation patterns in feature space. We propose to learn these filters as combinations of preset spectral filters defined by the Discrete Cosine Transform (DCT). Our proposed DCT-based harmonic blocks replace conventional convolutional layers to produce partially or fully harmonic versions of new or existing CNN architectures. Using DCT energy compaction properties, we demonstrate how the harmonic networks can be efficiently compressed by truncating high-frequency information in harmonic blocks thanks to the redundancies in the spectral domain. We report extensive experimental validation demonstrating benefits of the introduction of harmonic blocks into state-of-the-art CNN models in image classification, object detection and semantic segmentation applications.

@article{ULICNY2022108707,
  author  = {Matej Ulicny and Vladimir A. Krylov and Rozenn Dahyot},
  title   = {Harmonic Convolutional Networks based on Discrete Cosine Transform},
  journal = {Pattern Recognition},
  volume  = {129},
  pages   = {1-12},
  year    = {2022},
  issn    = {0031-3203},
  url     = {https://arxiv.org/pdf/2001.06570.pdf},
  doi     = {10.1016/j.patcog.2022.108707},
  note    = {arXiv:2001.06570. Github: https://github.com/matej-ulicny/harmonic-networks}
}

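The abstract above describes harmonic blocks that express learned filters as combinations of fixed DCT basis filters. The PyTorch layer below is an illustrative sketch under that reading, not the matej-ulicny/harmonic-networks implementation: every input channel is filtered with a fixed 3x3 DCT basis, and only a 1x1 convolution over the spectral responses is learned.

import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def dct_basis(k=3):
    # k*k fixed 2-D DCT-II basis filters of size k x k (unnormalised)
    pos = torch.arange(k, dtype=torch.float32) + 0.5
    freq = torch.arange(k, dtype=torch.float32)
    f = torch.cos(math.pi * freq[:, None] * pos[None, :] / k)        # f[u, i]
    return torch.einsum('ui,vj->uvij', f, f).reshape(k * k, 1, k, k)

class HarmonicConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.in_ch, self.k = in_ch, k
        self.register_buffer('basis', dct_basis(k))                  # fixed, not learned
        self.mix = nn.Conv2d(in_ch * k * k, out_ch, kernel_size=1)   # learned 1x1 combination

    def forward(self, x):
        w = self.basis.repeat(self.in_ch, 1, 1, 1)                   # same basis for every channel
        spec = F.conv2d(x, w, padding=self.k // 2, groups=self.in_ch)
        return self.mix(spec)

x = torch.randn(1, 8, 32, 32)
print(HarmonicConv2d(8, 16)(x).shape)                                # torch.Size([1, 16, 32, 32])
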
Improving semantic segmentation with graph-based structural knowledge.
Chopin, J.; Fasquel, J.; Mouchere, H.; Dahyot, R.; and Bloch, I.
In El Yacoubi, M.; Granger, E.; Yuen, P. C.; Pal, U.; and Vincent, N., editors, Pattern Recognition and Artificial Intelligence, pages 173-184, Paris, France, June 2022. Springer International Publishing. hal-03633029

Deep learning based pipelines for semantic segmentation often ignore structural information available on annotated images used for training. We propose a novel post-processing module enforcing structural knowledge about the objects of interest to improve segmentation results provided by deep learning. This module corresponds to a “many-to-one-or-none” inexact graph matching approach, and is formulated as a quadratic assignment problem. Using two standard measures for evaluation, we show experimentally that our pipeline for segmentation of 3D MRI data of the brain outperforms the baseline CNN (U-Net) used alone. In addition, our approach is shown to be resilient to small training datasets that often limit the performance of deep learning.

@inproceedings{ChopinICPRAI2022a,
  title     = {Improving semantic segmentation with graph-based structural knowledge},
  author    = {J. Chopin and J.-B. Fasquel and H. Mouchere and R. Dahyot and I. Bloch},
  booktitle = {Pattern Recognition and Artificial Intelligence},
  year      = {2022},
  month     = {June},
  pages     = {173--184},
  address   = {Paris, France},
  publisher = {Springer International Publishing},
  editor    = {El Yacoubi, Moun{\^i}m and Granger, Eric and Yuen, Pong Chi and Pal, Umapada and Vincent, Nicole},
  isbn      = {978-3-031-09037-0},
  doi       = {10.1007/978-3-031-09037-0_15},
  url       = {https://hal.science/hal-03633029/document},
  note      = {hal-03633029}
}

QAP Optimisation with Reinforcement Learning for Faster Graph Matching in Sequential Semantic Image Analysis.
Chopin, J.; Fasquel, J.; Mouchere, H.; Dahyot, R.; and Bloch, I.
In El Yacoubi, M.; Granger, E.; Yuen, P. C.; Pal, U.; and Vincent, N., editors, Pattern Recognition and Artificial Intelligence, pages 47-58, Paris, France, June 2022. Springer International Publishing. hal-03633036

The paper addresses the fundamental task of semantic image analysis by exploiting structural information (spatial relationships between image regions). We propose to perform such semantic image analysis by combining a deep neural network (CNN) with graph matching where graphs encode efficiently structural information related to regions segmented by the CNN. Our novel approach solves the quadratic assignment problem (QAP) sequentially for matching graphs. The optimal sequence for graph matching is conveniently defined using reinforcement learning (RL) based on the region membership probabilities produced by the CNN and their structural relationships. Our RL based strategy for solving QAP sequentially allows us to significantly reduce the combinatorial complexity for graph matching. Preliminary experiments are performed on both a synthetic dataset and a public dataset dedicated to the semantic segmentation of face images. Results show that the proposed RL-based ordering dramatically outperforms random ordering, and that our strategy is about 386 times faster than a global QAP-based approach, while preserving similar segmentation accuracy.

@inproceedings{ChopinICPRAI2022b,
  title     = {QAP Optimisation with Reinforcement Learning for Faster Graph Matching in Sequential Semantic Image Analysis},
  author    = {J. Chopin and J.-B. Fasquel and H. Mouchere and R. Dahyot and I. Bloch},
  booktitle = {Pattern Recognition and Artificial Intelligence},
  year      = {2022},
  month     = {June},
  pages     = {47--58},
  address   = {Paris, France},
  publisher = {Springer International Publishing},
  editor    = {El Yacoubi, Moun{\^i}m and Granger, Eric and Yuen, Pong Chi and Pal, Umapada and Vincent, Nicole},
  isbn      = {978-3-031-09037-0},
  doi       = {10.1007/978-3-031-09037-0_5},
  url       = {https://hal.science/hal-03633036/document},
  note      = {hal-03633036}
}

DR-VNet: Retinal Vessel Segmentation via Dense Residual UNet.
Karaali, A.; Dahyot, R.; and Sexton, D. J.
In El Yacoubi, M.; Granger, E.; Yuen, P. C.; Pal, U.; and Vincent, N., editors, Pattern Recognition and Artificial Intelligence, volume abs/2111.04739, Paris, France, June 2022. Springer International Publishing. Github: https://github.com/alikaraali/DR-VNet; arXiv DOI: 10.48550/arXiv.2111.04739

Accurate retinal vessel segmentation is an important task for many computer-aided diagnosis systems. Yet, it is still a challenging problem due to the complex vessel structures of an eye. Numerous vessel segmentation methods have been proposed recently, however more research is needed to deal with poor segmentation of thin and tiny vessels. To address this, we propose a new deep learning pipeline combining the efficiency of residual dense net blocks and residual squeeze and excitation blocks. We validate experimentally our approach on three datasets and show that our pipeline outperforms current state of the art techniques on the sensitivity metric relevant to assess capture of small vessels.

@inproceedings{karaali2022drvnet,
  title         = {DR-VNet: Retinal Vessel Segmentation via Dense Residual UNet},
  author        = {Ali Karaali and Rozenn Dahyot and Donal J. Sexton},
  year          = {2022},
  booktitle     = {Pattern Recognition and Artificial Intelligence},
  doi           = {10.1007/978-3-031-09037-0_17},
  url           = {https://arxiv.org/pdf/2111.04739.pdf},
  publisher     = {Springer International Publishing},
  editor        = {El Yacoubi, Moun{\^i}m and Granger, Eric and Yuen, Pong Chi and Pal, Umapada and Vincent, Nicole},
  isbn          = {978-3-031-09037-0},
  volume        = {abs/2111.04739},
  month         = {June},
  address       = {Paris, France},
  archivePrefix = {arXiv},
  primaryClass  = {eess.IV},
  note          = {Github: https://github.com/alikaraali/DR-VNet; arXiv DOI: 10.48550/arXiv.2111.04739}
}

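The abstract above refers to residual squeeze-and-excitation blocks. For readers unfamiliar with that building block, here is a generic residual SE block in PyTorch; it is a textbook-style sketch and not the DR-VNet code, with arbitrary channel and reduction values.

import torch
import torch.nn as nn

class ResidualSEBlock(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        # squeeze: global average pool; excite: per-channel gating weights in (0, 1)
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        y = self.conv(x)
        return torch.relu(x + y * self.se(y))   # residual connection plus channel gating

print(ResidualSEBlock(16)(torch.randn(1, 16, 64, 64)).shape)   # torch.Size([1, 16, 64, 64])
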
2021 (4)

Context Aware Object Geotagging.
Liu, C.; Ulicny, M.; Manzke, M.; and Dahyot, R.
In Irish Machine Vision and Image Processing (IMVIP 2021), 2021.

We propose an approach for geolocating assets from street view imagery by improving the quality of the metadata associated with the images using Structure from Motion, and by using contextual geographic information extracted from OpenStreetMap. Our pipeline is validated experimentally against the state of the art approaches for geotagging traffic lights.

@inproceedings{ChaoImvip2021,
  author        = {C.-J. Liu and Matej Ulicny and Michael Manzke and Rozenn Dahyot},
  title         = {Context Aware Object Geotagging},
  booktitle     = {Irish Machine Vision and Image Processing (IMVIP 2021)},
  year          = {2021},
  url           = {https://arxiv.org/pdf/2108.06302.pdf},
  doi           = {10.48550/arXiv.2108.06302},
  archivePrefix = {arXiv}
}

Model for predicting perception of facial action unit activation using virtual humans.
McDonnell, R.; Zibrek, K.; Carrigan, E.; and Dahyot, R.
Computers & Graphics, 100: 81-92. 2021. Winner of the 2022 Graphics Replicability Stamp Initiative (GRSI) best paper award; Github: https://github.com/Roznn/facial-blendshapes

Blendshape facial rigs are used extensively in the industry for facial animation of virtual humans. However, storing and manipulating large numbers of facial meshes (blendshapes) is costly in terms of memory and computation for gaming applications. Blendshape rigs are comprised of sets of semantically-meaningful expressions, which govern how expressive the character will be, often based on Action Units from the Facial Action Coding System (FACS). However, the relative perceptual importance of blendshapes has not yet been investigated. Research in Psychology and Neuroscience has shown that our brains process faces differently than other objects so we postulate that the perception of facial expressions will be feature-dependent rather than based purely on the amount of movement required to make the expression. Therefore, we believe that perception of blendshape visibility will not be reliably predicted by numerical calculations of the difference between the expression and the neutral mesh. In this paper, we explore the noticeability of blendshapes under different activation levels, and present new perceptually-based models to predict perceptual importance of blendshapes. The models predict visibility based on commonly-used geometry and image-based metrics.

@article{McDonnell2021,
  title    = {Model for predicting perception of facial action unit activation using virtual humans},
  journal  = {Computers \& Graphics},
  doi      = {10.1016/j.cag.2021.07.022},
  volume   = {100},
  pages    = {81-92},
  year     = {2021},
  issn     = {0097-8493},
  url      = {https://roznn.github.io/facial-blendshapes/CAG2021.pdf},
  author   = {Rachel McDonnell and Katja Zibrek and Emma Carrigan and Rozenn Dahyot},
  keywords = {facial action unit, perception, virtual character},
  note     = {Winner 2022 Graphics Replicability Stamp Initiative (GRSI) best paper award; Github: https://github.com/Roznn/facial-blendshapes}
}

Sliced L2 Distance for Colour Grading.
Alghamdi, H.; and Dahyot, R.
In 29th European Signal Processing Conference (EUSIPCO), pages 671-675, 2021. https://eurasip.org/Proceedings/Eusipco/Eusipco2021/pdfs/0000671.pdf

We propose a new method with L2 distance that maps one N-dimensional distribution to another, taking into account available information about correspondences. We solve the high-dimensional problem in 1D space using an iterative projection approach. To show the potentials of this mapping, we apply it to colour transfer between two images that exhibit overlapped scenes. Experiments show quantitative and qualitative competitive results as compared with the state of the art colour transfer methods.

@inproceedings{alghamdi2021sliced,
  title         = {Sliced L2 Distance for Colour Grading},
  author        = {Hana Alghamdi and Rozenn Dahyot},
  booktitle     = {29th European Signal Processing Conference (EUSIPCO)},
  doi           = {10.23919/EUSIPCO54536.2021.9616260},
  year          = {2021},
  pages         = {671-675},
  eprint        = {2102.09297},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  url           = {https://arxiv.org/pdf/2102.09297.pdf},
  note          = {https://eurasip.org/Proceedings/Eusipco/Eusipco2021/pdfs/0000671.pdf}
}

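The abstract above solves an N-dimensional distribution mapping through iterative 1-D projections. The sketch below shows the generic sliced/1-D-projection transfer idea on colour point clouds; it ignores the paper's correspondence information and is not the authors' L2 formulation, and the function name, step size, and toy data are hypothetical.

import numpy as np

def sliced_transfer(src, tgt, n_iter=100, step=0.5, seed=0):
    rng = np.random.default_rng(seed)
    x = src.astype(float).copy()
    for _ in range(n_iter):
        d = rng.normal(size=x.shape[1])
        d /= np.linalg.norm(d)                            # random unit direction
        px, pt = x @ d, tgt @ d                           # 1-D projections of both clouds
        q = (np.argsort(np.argsort(px)) + 0.5) / len(px)  # rank of each source sample as a quantile
        matched = np.quantile(pt, q)                      # matching target quantile values
        x += step * (matched - px)[:, None] * d           # move samples along the direction
    return x

rng = np.random.default_rng(1)
src = rng.random((500, 3))                 # e.g. RGB colours of the source image
tgt = 0.4 + 0.5 * rng.random((400, 3))     # colours of the target/reference image
out = sliced_transfer(src, tgt)
print(out.mean(axis=0), tgt.mean(axis=0))  # source statistics drift towards the target distribution
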
Tensor Reordering for CNN Compression.
Ulicny, M.; Krylov, V. A.; and Dahyot, R.
In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 3930-3934, 2021. Github: https://github.com/matej-ulicny/reorder-cnn-compression

We show how parameter redundancy in Convolutional Neural Network (CNN) filters can be effectively reduced by pruning in spectral domain. Specifically, the representation extracted via Discrete Cosine Transform (DCT) is more conducive for pruning than the original space. By relying on a combination of weight tensor reshaping and reordering we achieve high levels of layer compression with just minor accuracy loss. Our approach is applied to compress pretrained CNNs and we show that minor additional fine-tuning allows our method to recover the original model performance after a significant parameter reduction. We validate our approach on ResNet-50 and MobileNet-V2 architectures for ImageNet classification task.

@inproceedings{ulicny2020tensor,
  title         = {Tensor Reordering for CNN Compression},
  author        = {Matej Ulicny and Vladimir A. Krylov and Rozenn Dahyot},
  booktitle     = {ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  doi           = {10.1109/ICASSP39728.2021.9413944},
  pages         = {3930-3934},
  year          = {2021},
  url           = {https://arxiv.org/pdf/2010.12110.pdf},
  eprint        = {2010.12110},
  archivePrefix = {arXiv},
  primaryClass  = {cs.LG},
  note          = {Github: https://github.com/matej-ulicny/reorder-cnn-compression}
}

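The abstract above prunes CNN weights in the spectral (DCT) domain after reshaping and reordering the weight tensor. The sketch below illustrates plain DCT-domain pruning of a reshaped weight matrix, without the paper's reordering step; the tensor shape and the 25% keep ratio are arbitrary placeholders.

import numpy as np
from scipy.fft import dct, idct

w = np.random.default_rng(0).standard_normal((64, 32, 3, 3))   # hypothetical conv weight tensor
flat = w.reshape(64, -1)                                        # one row per output filter

coeff = dct(flat, axis=1, norm='ortho')                         # spectral representation of each row
keep = 0.25                                                     # keep 25% of coefficients
thresh = np.quantile(np.abs(coeff), 1.0 - keep)
pruned = np.where(np.abs(coeff) >= thresh, coeff, 0.0)          # zero the smallest coefficients

w_rec = idct(pruned, axis=1, norm='ortho').reshape(w.shape)     # approximate reconstructed weights
err = np.linalg.norm(w - w_rec) / np.linalg.norm(w)
print(f"kept {keep:.0%} of DCT coefficients, relative error {err:.3f}")
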
2020 (9)

Semantic image segmentation based on spatial relationships and inexact graph matching.
Chopin, J.; Fasquel, J.; Mouchere, H.; Dahyot, R.; and Bloch, I.
In 2020 Tenth International Conference on Image Processing Theory, Tools and Applications (IPTA), pages 1-6, Nov 2020. https://github.com/Jeremy-Chopin/FASSEG-instances

We propose a method for semantic image segmentation, combining a deep neural network and spatial relationships between image regions, encoded in a graph representation of the scene. Our proposal is based on inexact graph matching, formulated as a quadratic assignment problem applied to the output of the neural network. The proposed method is evaluated on a public dataset used for segmentation of images of faces, and compared to the U-Net deep neural network that is widely used for semantic segmentation. Preliminary results show that our approach is promising. In terms of Intersection-over-Union of region bounding boxes, the improvement is 2.4% on average, compared to U-Net, and up to 24.4% for some regions. Further improvements are observed when reducing the size of the training dataset (up to 8.5% on average).

@inproceedings{9286611,
  author    = {J. Chopin and J.-B. Fasquel and H. Mouchere and R. Dahyot and I. Bloch},
  booktitle = {2020 Tenth International Conference on Image Processing Theory, Tools and Applications (IPTA)},
  title     = {Semantic image segmentation based on spatial relationships and inexact graph matching},
  year      = {2020},
  pages     = {1-6},
  keywords  = {Computer vision; Deep learning; Inexact graph matching; Quadratic assignment problem},
  doi       = {10.1109/IPTA50016.2020.9286611},
  note      = {https://github.com/Jeremy-Chopin/FASSEG-instances},
  url       = {https://hal.science/hal-02916165/document},
  issn      = {2154-512X},
  month     = {Nov}
}

Bonseyes AI Pipeline—Bringing AI to You: End-to-End Integration of Data, Algorithms, and Deployment Tools.
Prado, M. D.; Su, J.; Saeed, R.; Keller, L.; Vallez, N.; Anderson, A.; Gregg, D.; Benini, L.; Llewellynn, T.; Ouerhani, N.; Dahyot, R.; and Pazos, N.
ACM Trans. Internet Things, 1(4), August 2020.

Next generation of embedded Information and Communication Technology (ICT) systems are interconnected and collaborative systems able to perform autonomous tasks. The remarkable expansion of the embedded ICT market, together with the rise and breakthroughs of Artificial Intelligence (AI), have put the focus on the Edge as it stands as one of the keys for the next technological revolution: the seamless integration of AI in our daily life. However, training and deployment of custom AI solutions on embedded devices require a fine-grained integration of data, algorithms, and tools to achieve high accuracy and overcome functional and non-functional requirements. Such integration requires a high level of expertise that becomes a real bottleneck for small and medium enterprises wanting to deploy AI solutions on the Edge, which, ultimately, slows down the adoption of AI on applications in our daily life. In this work, we present a modular AI pipeline as an integrating framework to bring data, algorithms, and deployment tools together. By removing the integration barriers and lowering the required expertise, we can interconnect the different stages of particular tools and provide a modular end-to-end development of AI products for embedded devices. Our AI pipeline consists of four modular main steps: (i) data ingestion, (ii) model training, (iii) deployment optimization, and (iv) the IoT hub integration. To show the effectiveness of our pipeline, we provide examples of different AI applications during each of the steps. Besides, we integrate our deployment framework, Low-Power Deep Neural Network (LPDNN), into the AI pipeline and present its lightweight architecture and deployment capabilities for embedded devices. Finally, we demonstrate the results of the AI pipeline by showing the deployment of several AI applications such as keyword spotting, image classification, and object detection on a set of well-known embedded platforms, where LPDNN consistently outperforms all other popular deployment frameworks.

@article{10.1145/3403572,
  author     = {Prado, Miguel De and Su, Jing and Saeed, Rabia and Keller, Lorenzo and Vallez, Noelia and Anderson, Andrew and Gregg, David and Benini, Luca and Llewellynn, Tim and Ouerhani, Nabil and Dahyot, Rozenn and Pazos, Nuria},
  title      = {Bonseyes AI Pipeline—Bringing AI to You: End-to-End Integration of Data, Algorithms, and Deployment Tools},
  year       = {2020},
  issue_date = {August 2020},
  publisher  = {Association for Computing Machinery},
  address    = {New York, NY, USA},
  volume     = {1},
  number     = {4},
  issn       = {2691-1914},
  url        = {https://arxiv.org/pdf/1901.05049.pdf},
  doi        = {10.1145/3403572},
  journal    = {ACM Trans. Internet Things},
  month      = {aug},
  articleno  = {26},
  numpages   = {25},
  keywords   = {deep learning, AI pipeline, keyword spotting, fragmentation}
}

Méthode d'analyse sémantique d'images combinant apprentissage profond et relations structurelles par appariement de graphes.
Chopin, J.; Fasquel, J.; Mouchere, H.; Bloch, I.; and Dahyot, R.
In Rencontres des Jeunes Chercheurs en Intelligence Artificielle (RJCIA 2020), Angers, France, June 2020. https://hal.archives-ouvertes.fr/hal-02882043

We propose a method for semantic image segmentation, combining a deep neural network and spatial relationships between image regions, encoded in a graph representation of the scene. Our proposal is based on inexact graph matching, applied to the output of a deep neural network. The proposed method is evaluated on a public dataset used for segmentation of images of faces. Preliminary results show that, in terms of IoU of region bounding boxes, the use of spatial relationships leads to an improvement of 2.4 percent on average, and up to 24.4 percent for some regions.

@inproceedings{chopin:hal-02882043,
  title     = {Méthode d'analyse sémantique d'images combinant apprentissage profond et relations structurelles par appariement de graphes},
  author    = {Chopin, J. and Fasquel, J.-B. and Mouchere, H. and Bloch, I. and Dahyot, R.},
  booktitle = {Rencontres des Jeunes Chercheurs en Intelligence Artificielle (RJCIA 2020)},
  address   = {Angers, France},
  year      = {2020},
  month     = {jun},
  url       = {https://hal.science/hal-02882043/document},
  note      = {https://hal.archives-ouvertes.fr/hal-02882043}
}

\n \n\n \n \n \n \n \n \n IM2ELEVATION: Building Height Estimation from Single-View Aerial Imagery.\n \n \n \n \n\n\n \n Liu, C.; Krylov, V. A.; Kane, P.; Kavanagh, G.; and Dahyot, R.\n\n\n \n\n\n\n Remote Sensing, 12(17): 2719. August 2020.\n Github: https://github.com/speed8928/IMELE\n\n\n\n
\n\n\n\n \n \n \"IM2ELEVATION:Paper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 6 downloads\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{Smile2020, \nnote = {Github: https://github.com/speed8928/IMELE},\nabstract = {Estimation of the Digital Surface Model (DSM) and building heights from single-view aerial imagery is a challenging inherently ill-posed problem that we address in this paper by resorting to machine learning. We propose an end-to-end trainable convolutional-deconvolutional deep neural network architecture that enables learning mapping from a single aerial imagery to a DSM for analysis of urban scenes. We perform multisensor fusion of aerial optical and aerial light detection and ranging (Lidar) data to prepare the training data for our pipeline. The dataset quality is key to successful estimation performance. Typically, a substantial amount of misregistration artifacts are present due to georeferencing/projection errors, sensor calibration inaccuracies, and scene changes between acquisitions. To overcome these issues, we propose a registration procedure to improve Lidar and optical data alignment that relies on Mutual Information, followed by Hough transform-based validation step to adjust misregistered image patches. We validate our building height estimation model on a high-resolution dataset captured over central Dublin, Ireland: Lidar point cloud of 2015 and optical aerial images from 2017. These data allow us to validate the proposed registration procedure and perform 3D model reconstruction from single-view aerial imagery. We also report state-of-the-art performance of our proposed architecture on several popular DSM estimation datasets},\ndoi =  {10.3390/rs12172719}, \nurl =  {https://www.mdpi.com/2072-4292/12/17/2719/pdf}, \nyear =  {2020}, \nmonth =  {August}, \nvolume =  {12}, \nnumber =  {17}, \npages =  {2719},\nauthor =  {C.-J. Liu and V. A. Krylov and P. Kane and G. Kavanagh and R. Dahyot}, \ntitle =  {IM2ELEVATION: Building Height Estimation from Single-View Aerial Imagery}, \njournal =  {Remote Sensing},\npublisher =  {{MDPI} {AG}}}\n
\n
\n\n\n
\n Estimation of the Digital Surface Model (DSM) and building heights from single-view aerial imagery is a challenging, inherently ill-posed problem that we address in this paper by resorting to machine learning. We propose an end-to-end trainable convolutional-deconvolutional deep neural network architecture that enables learning a mapping from single aerial imagery to a DSM for analysis of urban scenes. We perform multisensor fusion of aerial optical and aerial light detection and ranging (Lidar) data to prepare the training data for our pipeline. The dataset quality is key to successful estimation performance. Typically, a substantial amount of misregistration artifacts are present due to georeferencing/projection errors, sensor calibration inaccuracies, and scene changes between acquisitions. To overcome these issues, we propose a registration procedure to improve Lidar and optical data alignment that relies on Mutual Information, followed by a Hough transform-based validation step to adjust misregistered image patches. We validate our building height estimation model on a high-resolution dataset captured over central Dublin, Ireland: Lidar point cloud of 2015 and optical aerial images from 2017. These data allow us to validate the proposed registration procedure and perform 3D model reconstruction from single-view aerial imagery. We also report state-of-the-art performance of our proposed architecture on several popular DSM estimation datasets.\n
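The registration procedure mentioned above relies on Mutual Information (MI) as an alignment criterion between the optical and Lidar-derived data. A minimal NumPy sketch of the MI score is given below; the bin count and the toy patches are assumptions, not the paper's data.

```python
# Minimal sketch of the mutual-information criterion used to compare two
# modalities (e.g. an optical patch and a Lidar-derived height patch).
# The 32-bin histogram and the synthetic patches are illustrative assumptions.
import numpy as np

def mutual_information(a, b, bins=32):
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(0)
optical = rng.random((64, 64))
aligned_height = 0.7 * optical + 0.3 * rng.random((64, 64))   # related modality
shifted_height = np.roll(aligned_height, 8, axis=1)           # misregistered copy

print("MI aligned   :", mutual_information(optical, aligned_height))
print("MI misaligned:", mutual_information(optical, shifted_height))
# A registration search keeps the offset that maximises MI.
```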
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Investigating Perceptually Based Models to Predict Importance of Facial Blendshapes.\n \n \n \n \n\n\n \n Carrigan, E.; Zibrek, K.; Dahyot, R.; and McDonnell, R.\n\n\n \n\n\n\n In Motion, Interaction and Games, of MIG '20, New York, NY, USA, 2020. Association for Computing Machinery\n Awarded Best Short Paper Award MIG2020 Github: https://roznn.github.io/facial-blendshapes/\n\n\n\n
\n\n\n\n \n \n \"InvestigatingPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 2 downloads\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@inproceedings{10.1145/3424636.3426904, \nauthor =  {Carrigan, Emma and Zibrek, Katja and Dahyot, Rozenn and McDonnell, Rachel}, \ntitle =  {Investigating Perceptually Based Models to Predict Importance of Facial Blendshapes}, \nyear =  {2020},\nisbn =  {9781450381710}, \npublisher =  {Association for Computing Machinery},\naddress =  {New York, NY, USA},\nurl =  {https://dl.acm.org/doi/pdf/10.1145/3424636.3426904}, \ndoi =  {10.1145/3424636.3426904}, \nabstract =  {Blendshape facial rigs are used extensively in the industry for facial \nanimation of virtual humans. However, storing and manipulating large numbers of facial \nmeshes is costly in terms of memory and computation for gaming applications, yet the relative perceptual importance of blendshapes has not yet been investigated.\nResearch in Psychology and Neuroscience has shown that our brains process faces differently than other objects, so we postulate that \nthe perception of facial expressions will be feature-dependent rather than based purely on the amount of movement required to make the expression. \nIn this paper, we explore the noticeability of blendshapes under different activation levels, and present new perceptually based models to predict\nperceptual importance of blendshapes. The models predict visibility based on commonly-used geometry and image-based metrics. },\nbooktitle =  {Motion, Interaction and Games}, \narticleno =  {2}, \nnumpages =  {6}, \nnote = {Awarded Best Short Paper Award MIG2020 Github: https://roznn.github.io/facial-blendshapes/},\nkeywords =  {blendshapes, perception, action units, linear model}, \nlocation =  {Virtual Event, SC, USA}, \nseries =  {MIG '20}}\n
\n
\n\n\n
\n Blendshape facial rigs are used extensively in the industry for facial animation of virtual humans. However, storing and manipulating large numbers of facial meshes is costly in terms of memory and computation for gaming applications, yet the relative perceptual importance of blendshapes has not yet been investigated. Research in Psychology and Neuroscience has shown that our brains process faces differently than other objects, so we postulate that the perception of facial expressions will be feature-dependent rather than based purely on the amount of movement required to make the expression. In this paper, we explore the noticeability of blendshapes under different activation levels, and present new perceptually based models to predict perceptual importance of blendshapes. The models predict visibility based on commonly-used geometry and image-based metrics. \n
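The perceptual models described above are linear predictors of blendshape noticeability from simple geometry and image-based metrics. The sketch below is purely hypothetical: the feature names, synthetic responses and ordinary least-squares fit are assumptions standing in for the study's measured data and model.

```python
# Hypothetical sketch of a perceptually based linear model: predict how
# noticeable a blendshape activation is from geometry/image metrics.
# The features, synthetic data and least-squares fit are assumptions.
import numpy as np

rng = np.random.default_rng(1)
n = 200
vertex_displacement = rng.random(n)      # geometry-based metric
image_difference = rng.random(n)         # image-based metric (e.g. pixel delta)
noticeability = (0.6 * vertex_displacement + 0.3 * image_difference
                 + 0.05 * rng.normal(size=n))

X = np.column_stack([vertex_displacement, image_difference, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, noticeability, rcond=None)
print("fitted weights (displacement, image diff, bias):", np.round(coef, 3))
```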
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Iterative Nadaraya-Watson Distribution Transfer for Colour Grading.\n \n \n \n \n\n\n \n Alghamdi, H.; and Dahyot, R.\n\n\n \n\n\n\n In IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP), pages 1-6, 2020. \n Github: https://github.com/leshep/INWDT\n\n\n\n
\n\n\n\n \n \n \"IterativePaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{DBLP:journals/corr/abs-2006-09208, \nauthor =  {Hana Alghamdi and  Rozenn Dahyot}, \ntitle =  {Iterative Nadaraya-Watson Distribution Transfer for Colour Grading}, \nbooktitle =  {IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP)},\nvolume =  {},\npages = {1-6},\nabstract = {We propose a new method with Nadaraya-Watson that maps one N-dimensional distribution to another,\ntaking into account available information about correspondences. \nWe extend the 2D/3D problem to higher dimensions by encoding overlapping neighborhoods \nof data points and solve the high-dimensional problem in 1D space using an iterative projection\napproach. To show the potentials of this mapping, we apply it to colour transfer between two images\nthat exhibit overlapped scenes. Experiments show quantitative and qualitative improvements\nover the previous state of the art colour transfer methods.},\nnote = {Github: https://github.com/leshep/INWDT},\ndoi = {10.1109/MMSP48831.2020.9287097},\nyear =  {2020}, \nurl =  {https://arxiv.org/pdf/2006.09208.pdf}}\n\n
\n
\n\n\n
\n We propose a new method with Nadaraya-Watson that maps one N-dimensional distribution to another, taking into account available information about correspondences. We extend the 2D/3D problem to higher dimensions by encoding overlapping neighborhoods of data points and solve the high-dimensional problem in 1D space using an iterative projection approach. To show the potentials of this mapping, we apply it to colour transfer between two images that exhibit overlapped scenes. Experiments show quantitative and qualitative improvements over the previous state of the art colour transfer methods.\n
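At the core of the method is a Nadaraya-Watson estimator applied on 1D projections: given corresponding samples from the source and target distributions, a new source value is mapped to a kernel-weighted average of the target values. The 1D sketch below assumes a Gaussian kernel, a fixed bandwidth and toy correspondences; the paper iterates this over projections of higher-dimensional colour data.

```python
# 1D sketch of the Nadaraya-Watson mapping used for distribution transfer.
# Kernel, bandwidth and the synthetic correspondences are assumptions.
import numpy as np

def nadaraya_watson(x_query, x_src, y_tgt, h=0.1):
    """Map source values to target values with a Gaussian-kernel NW estimator."""
    w = np.exp(-0.5 * ((x_query[:, None] - x_src[None, :]) / h) ** 2)
    return (w * y_tgt[None, :]).sum(axis=1) / (w.sum(axis=1) + 1e-12)

rng = np.random.default_rng(0)
x_src = rng.normal(0.3, 0.1, 500)                          # 1D projection of source colours
y_tgt = 2.0 * x_src + 0.2 + 0.02 * rng.normal(size=500)    # corresponding target values

x_new = np.linspace(0.0, 0.6, 7)
print(np.round(nadaraya_watson(x_new, x_src, y_tgt), 3))
```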
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Patch based Colour Transfer using SIFT Flow.\n \n \n \n \n\n\n \n Alghamdi, H.; and Dahyot, R.\n\n\n \n\n\n\n In Irish Machine Vision and Image Processing (IMVIP 2020), volume abs/2005.09015, 2020. \n Best Paper Award, Book Open Access http://research.thea.ie/handle/20.500.12065/3429\n\n\n\n
\n\n\n\n \n \n \"PatchPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{DBLP:journals/corr/abs-2005-09015,\nauthor =  {Hana Alghamdi and Rozenn Dahyot}, \ntitle =  {Patch based Colour Transfer using {SIFT} Flow},\nbooktitle =  {Irish Machine Vision and Image Processing (IMVIP 2020)},\nvolume =  {abs/2005.09015},\nyear =  {2020},\nabstract = {We propose a new colour transfer method with Optimal Transport (OT) to\ntransfer the colour of a source image to match the colour of a target image of the same scene \nthat may exhibit large motion changes between images. By definition OT does not take into account\nany available information about correspondences when computing the optimal solution. To tackle \nthis problem we propose to encode overlapping neighborhoods of pixels using both their colour and \nspatial correspondences estimated using motion estimation. We solve the high dimensional problem \nin 1D space using an iterative projection approach. We further introduce smoothing as part of \nthe iterative algorithms for solving optimal transport namely Iterative Distribution Transport (IDT) and\nits variant the Sliced Wasserstein Distance (SWD). Experiments show quantitative and qualitative \nimprovements over previous state of the art colour transfer methods.},\nurl =  {https://arxiv.org/pdf/2005.09015.pdf}, \nnote = {Best Paper Award, Book Open Access http://research.thea.ie/handle/20.500.12065/3429},\narchivePrefix =  {arXiv}, \neprint =  {2005.09015},\ntimestamp =  {Fri, 22 May 2020 16:21:29 +0200},\nbiburl =  {https://dblp.org/rec/journals/corr/abs-2005-09015.bib},\nbibsource =  {dblp computer science bibliography, https://dblp.org}}\n
\n
\n\n\n
\n We propose a new colour transfer method with Optimal Transport (OT) to transfer the colour of a source image to match the colour of a target image of the same scene that may exhibit large motion changes between images. By definition OT does not take into account any available information about correspondences when computing the optimal solution. To tackle this problem we propose to encode overlapping neighborhoods of pixels using both their colour and spatial correspondences estimated using motion estimation. We solve the high dimensional problem in 1D space using an iterative projection approach. We further introduce smoothing as part of the iterative algorithms for solving optimal transport namely Iterative Distribution Transport (IDT) and its variant the Sliced Wasserstein Distance (SWD). Experiments show quantitative and qualitative improvements over previous state of the art colour transfer methods.\n
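The iterative projection idea behind IDT/SWD can be sketched in a few lines: project both colour point sets onto a random direction, match sorted 1D samples (the 1D optimal transport map) and displace the source samples accordingly, then repeat. This minimal sketch assumes equal-size point sets and omits the patch/correspondence encoding and smoothing that the paper adds.

```python
# Minimal sketch of one sliced / iterative-projection transport step.
# Point sets, step size and iteration count are illustrative assumptions.
import numpy as np

def sliced_transport_step(src, tgt, rng, step=1.0):
    """One iteration of 1D distribution transfer along a random direction."""
    d = rng.normal(size=src.shape[1])
    d /= np.linalg.norm(d)
    ps, pt = src @ d, tgt @ d                    # 1D projections
    order_s = np.argsort(ps)
    match = np.sort(pt)                          # target quantiles
    delta = np.empty_like(ps)
    delta[order_s] = match - ps[order_s]         # 1D optimal transport displacement
    return src + step * delta[:, None] * d[None, :]

rng = np.random.default_rng(0)
src = rng.normal([0.2, 0.2, 0.2], 0.05, size=(1000, 3))   # source colours (RGB-like)
tgt = rng.normal([0.6, 0.5, 0.4], 0.08, size=(1000, 3))   # target colours
for _ in range(50):
    src = sliced_transport_step(src, tgt, rng)
print("transferred mean:", np.round(src.mean(axis=0), 2),
      "target mean:", np.round(tgt.mean(axis=0), 2))
```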
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Efficient Visual Place Retrieval System Using Google Street View.\n \n \n \n \n\n\n \n Aljuaidi, R.; and Dahyot, R.\n\n\n \n\n\n\n In Irish Machine Vision and Image Processing (IMVIP 2020), 2020. \n Book Open Access http://research.thea.ie/handle/20.500.12065/3429\n\n\n\n
\n\n\n\n \n \n \"EfficientPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n  \n \n 1 download\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{ReemIMVIP2020,\nauthor =  {R. Aljuaidi and R. Dahyot}, \ntitle =  {Efficient Visual Place Retrieval System Using Google Street View},\nbooktitle =  {Irish Machine Vision and Image Processing (IMVIP 2020)},\nyear =  {2020},\nabstract = {},\nurl =  {http://research.thea.ie/handle/20.500.12065/3429}, \nnote = {Book Open Access http://research.thea.ie/handle/20.500.12065/3429},\n}\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n CNN based Color and Thermal Image Fusion for Object Detection in Automated Driving.\n \n \n \n \n\n\n \n Yadav, R.; Samir, A.; Rashed, H.; Yogamani, S.; and Dahyot, R.\n\n\n \n\n\n\n In Irish Machine Vision and Image Processing (IMVIP 2020), 2020. \n Book Open Access http://research.thea.ie/handle/20.500.12065/3429\n\n\n\n
\n\n\n\n \n \n \"CNNPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 1 download\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{YadavIMVIP2020,\nauthor =  {R. Yadav and A. Samir and H. Rashed and S. Yogamani and R. Dahyot}, \ntitle =  {CNN based Color and Thermal Image Fusion for Object Detection in Automated Driving},\nbooktitle =  {Irish Machine Vision and Image Processing (IMVIP 2020)},\nyear =  {2020},\nabstract = {Visual spectrum camera is a primary sensor in an automated driving system. It provides a high information\ndensity at a low cost. Visual perception is extensively studied in the literature and it is a mature\ncomponent deployed in existing commercial vehicles. Its main disadvantage is the performance degradation\nin low light scenarios. Thermal cameras are increasingly being used to complement cameras for dark conditions\nlike night time or driving through a tunnel. In this paper, we explore CNN based fusion architecture\nfor object detection. We explore two automotive datasets which provide data for both these sensors namely\nKAIST multispectral pedestrian dataset and FLIR thermal object detection dataset. We train baseline Faster-\nRCNN models for color only and thermal only models on KAIST dataset. Color model outperforms Thermal\nin day conditions and Thermal model outperforms color in night conditions illustrating their complementary\nnature. We construct a simple mid-level CNN fusion architecture which performs significantly better than\nthe baseline models. We observe an improvement of 0.62\\% in miss rate compared to existing methods.\nWe also explored the more recent FLIR dataset. Because of the vastly different resolution, aspect ratio and\nfield of view of the color and thermal images provided, our simple fusion architecture did not perform well\npointing out the need for further research in this area.},\nurl =  {https://research.thea.ie/bitstream/handle/20.500.12065/3429/IMVIP2020Proceedings.pdf}, \nnote = {Book Open Access http://research.thea.ie/handle/20.500.12065/3429},\n}\n
\n
\n\n\n
\n Visual spectrum camera is a primary sensor in an automated driving system. It provides a high information density at a low cost. Visual perception is extensively studied in the literature and it is a mature component deployed in existing commercial vehicles. Its main disadvantage is the performance degradation in low light scenarios. Thermal cameras are increasingly being used to complement cameras for dark conditions like night time or driving through a tunnel. In this paper, we explore CNN based fusion architecture for object detection. We explore two automotive datasets which provide data for both these sensors namely KAIST multispectral pedestrian dataset and FLIR thermal object detection dataset. We train baseline Faster-RCNN models for color only and thermal only models on KAIST dataset. Color model outperforms Thermal in day conditions and Thermal model outperforms color in night conditions illustrating their complementary nature. We construct a simple mid-level CNN fusion architecture which performs significantly better than the baseline models. We observe an improvement of 0.62% in miss rate compared to existing methods. We also explored the more recent FLIR dataset. Because of the vastly different resolution, aspect ratio and field of view of the color and thermal images provided, our simple fusion architecture did not perform well pointing out the need for further research in this area.\n
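A mid-level fusion architecture of the kind described above keeps separate convolutional stems for the colour and thermal inputs and concatenates their feature maps before a shared head. The PyTorch sketch below is hypothetical: layer sizes, depth and the classification head are assumptions, whereas the paper fuses features inside a Faster-RCNN detector.

```python
# Hypothetical sketch of mid-level colour/thermal feature fusion (PyTorch).
# Channel sizes and the classification head are illustrative assumptions.
import torch
import torch.nn as nn

class MidLevelFusion(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        def stem(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.rgb_stem = stem(3)       # colour branch
        self.thermal_stem = stem(1)   # thermal branch
        self.head = nn.Sequential(
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes))

    def forward(self, rgb, thermal):
        fused = torch.cat([self.rgb_stem(rgb), self.thermal_stem(thermal)], dim=1)
        return self.head(fused)

model = MidLevelFusion()
logits = model(torch.randn(2, 3, 128, 160), torch.randn(2, 1, 128, 160))
print(logits.shape)   # torch.Size([2, 2])
```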
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2019\n \n \n (13)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Patch-Based Colour Transfer with Optimal Transport.\n \n \n \n \n\n\n \n Alghamdi, H.; Grogan, M.; and Dahyot, R.\n\n\n \n\n\n\n In 2019 27th European Signal Processing Conference (EUSIPCO), pages 1-5, Sep. 2019. \n Github: https://github.com/leshep/PCT_OT\n\n\n\n
\n\n\n\n \n \n \"Patch-BasedPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 1 download\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@INPROCEEDINGS{8902611, \nauthor =  {H. {Alghamdi} and M. {Grogan} and R. {Dahyot}}, \nbooktitle =  {2019 27th European Signal Processing Conference (EUSIPCO)},\ntitle =  {Patch-Based Colour Transfer with Optimal Transport}, \nurl = {https://www.eurasip.org/Proceedings/Eusipco/eusipco2019/Proceedings/papers/1570533179.pdf},\nnote = {Github: https://github.com/leshep/PCT_OT},\nabstract = {This paper proposes a new colour transfer method\nwith Optimal transport to transfer the colour of a source image to\nmatch the colour of a target image of the same scene. We propose\nto formulate the problem in higher dimensional spaces (than\ncolour spaces) by encoding overlapping neighborhoods of pixels\ncontaining colour information as well as spatial information.\nSince several recoloured candidates are now generated for each\npixel in the source image, we define an original procedure\nto efficiently merge these candidates which allows denoising\nand artifact removal as well as colour transfer. Experiments\nshow quantitative and qualitative improvements over previous\ncolour transfer methods. Our method can be applied to different\ncontexts of colour transfer such as transferring colour between\ndifferent camera models, camera settings, illumination conditions\nand colour retouch styles for photographs.},\nyear =  {2019}, \nvolume =  {}, \nnumber =  {}, \npages =  {1-5}, \nkeywords =  {optimal transport;colour transfer;image enhancement;JPEG compression blocks}, \ndoi =  {10.23919/EUSIPCO.2019.8902611},\nISSN =  {2219-5491}, month =  {Sep.}}\n\n
\n
\n\n\n
\n This paper proposes a new colour transfer method with Optimal transport to transfer the colour of a source image to match the colour of a target image of the same scene. We propose to formulate the problem in higher dimensional spaces (than colour spaces) by encoding overlapping neighborhoods of pixels containing colour information as well as spatial information. Since several recoloured candidates are now generated for each pixel in the source image, we define an original procedure to efficiently merge these candidates which allows denoising and artifact removal as well as colour transfer. Experiments show quantitative and qualitative improvements over previous colour transfer methods. Our method can be applied to different contexts of colour transfer such as transferring colour between different camera models, camera settings, illumination conditions and colour retouch styles for photographs.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Harmonic Networks with Limited Training Samples.\n \n \n \n \n\n\n \n Ulicny, M.; Krylov, V. A.; and Dahyot, R.\n\n\n \n\n\n\n In 2019 27th European Signal Processing Conference (EUSIPCO), pages 1-5, Sep. 2019. \n Github: https://github.com/matej-ulicny/harmonic-networks and paper also on arxiv http://arxiv.org/abs/1905.00135 and https://www.eurasip.org/Proceedings/Eusipco/eusipco2019/Proceedings/papers/1570533913.pdf\n\n\n\n
\n\n\n\n \n \n \"HarmonicPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@INPROCEEDINGS{8902831, \nauthor =  {M. {Ulicny} and V. A. {Krylov} and R. {Dahyot}}, \nbooktitle =  {2019 27th European Signal Processing Conference (EUSIPCO)},\nurl = {https://mural.maynoothuniversity.ie/15158/1/RD_harmonic%20networks.pdf},\ntitle =  {Harmonic Networks with Limited Training Samples}, \nyear =  {2019}, \nvolume =  {}, \nnumber =  {}, \npages =  {1-5}, \nabstract = {Convolutional neural networks (CNNs) are very popular nowadays for image processing. CNNs allow one to learn optimal filters in a (mostly) supervised machine learning context. However this typically requires abundant labelled training data to estimate the filter parameters. Alternative strategies have been deployed for reducing the number of parameters and / or filters to be learned and thus decrease overfitting. In the context of reverting to preset filters, we propose here a computationally efficient harmonic block that uses Discrete Cosine Transform (DCT) filters in CNNs. In this work we examine the performance of harmonic networks in limited training data scenario. We validate experimentally that its performance compares well against scattering networks that use wavelets as preset filters.},\nkeywords =  {Lapped Discrete Cosine Transform;harmonic network;convolutional filter;limited data}, \ndoi =  {10.23919/EUSIPCO.2019.8902831},\nnote = {Github: https://github.com/matej-ulicny/harmonic-networks and paper also on arxiv http://arxiv.org/abs/1905.00135 and https://www.eurasip.org/Proceedings/Eusipco/eusipco2019/Proceedings/papers/1570533913.pdf},\narchivePrefix  =  {arXiv},\neprint     =  {1905.00135},\nISSN =  {2219-5491}, \nmonth =  {Sep.}}\n\n\n
\n
\n\n\n
\n Convolutional neural networks (CNNs) are very popular nowadays for image processing. CNNs allow one to learn optimal filters in a (mostly) supervised machine learning context. However, this typically requires abundant labelled training data to estimate the filter parameters. Alternative strategies have been deployed for reducing the number of parameters and / or filters to be learned and thus decrease overfitting. In the context of reverting to preset filters, we propose here a computationally efficient harmonic block that uses Discrete Cosine Transform (DCT) filters in CNNs. In this work we examine the performance of harmonic networks in a limited training data scenario. We validate experimentally that its performance compares well against scattering networks that use wavelets as preset filters.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Super-Resolution on Degraded Low-Resolution Images Using Convolutional Neural Networks.\n \n \n \n \n\n\n \n Albluwi, F.; Krylov, V. A.; and Dahyot, R.\n\n\n \n\n\n\n In 2019 27th European Signal Processing Conference (EUSIPCO), pages 1-5, Sep. 2019. \n Github: https://github.com/Fatma-Albluwi/DBSR\n\n\n\n
\n\n\n\n \n \n \"Super-ResolutionPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@INPROCEEDINGS{8903000, \nauthor =  {F. {Albluwi} and V. A. {Krylov} and R. {Dahyot}}, \nbooktitle =  {2019 27th European Signal Processing Conference (EUSIPCO)}, \ntitle =  {Super-Resolution on Degraded Low-Resolution Images Using Convolutional Neural Networks}, \nyear =  {2019}, \nvolume =  {}, \nabstract = {Single Image Super-Resolution (SISR) has witnessed\na dramatic improvement in recent years through the use of deep\nlearning and, in particular, convolutional neural networks (CNN).\nIn this work we address reconstruction from low-resolution\nimages and consider as well degrading factors in images such as\nblurring. To address this challenging problem, we propose a new\narchitecture to tackle blur with the down-sampling of images by\nextending the DBSRCNN architecture. We validate our new\narchitecture (DBSR) experimentally against several state of the\nart super-resolution techniques.},\nnote = {Github: https://github.com/Fatma-Albluwi/DBSR},\nurl = {https://www.eurasip.org/Proceedings/Eusipco/eusipco2019/Proceedings/papers/1570533420.pdf},\nnumber =  {}, \npages =  {1-5}, \nkeywords =  {Image super-resolution;image deblurring;deep learning;CNN}, \ndoi =  {10.23919/EUSIPCO.2019.8903000}, \nISSN =  {2219-5491}, month =  {Sep.}}\n\n
\n
\n\n\n
\n Single Image Super-Resolution (SISR) has witnessed a dramatic improvement in recent years through the use of deep learning and, in particular, convolutional neural networks (CNN). In this work we address reconstruction from low-resolution images and consider as well degrading factors in images such as blurring. To address this challenging problem, we propose a new architecture to tackle blur with the down-sampling of images by extending the DBSRCNN architecture. We validate our new architecture (DBSR) experimentally against several state of the art super-resolution techniques.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Mini-Batch VLAD for Visual Place Retrieval.\n \n \n \n \n\n\n \n Aljuaidi, R.; Su, J.; and Dahyot, R.\n\n\n \n\n\n\n In 2019 30th Irish Signals and Systems Conference (ISSC), pages 1-6, June 2019. \n Awarded Best Student Paper at ISSC 2019. Github: https://github.com/ReemTCD/Mini_Batch_VLAD\n\n\n\n
\n\n\n\n \n \n \"Mini-BatchPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@INPROCEEDINGS{8904931, \nauthor =  {R. {Aljuaidi} and J. {Su} and R. {Dahyot}}, \nbooktitle =  {2019 30th Irish Signals and Systems Conference (ISSC)}, \ntitle =  {Mini-Batch VLAD for Visual Place Retrieval}, \nnote = {Awarded Best Student Paper at ISSC 2019. Github: https://github.com/ReemTCD/Mini_Batch_VLAD},\nyear =  {2019}, \nvolume =  {}, \nnumber =  {}, \npages =  {1-6}, \nabstract = {This study investigates the visual place retrieval of an image query using a geotagged image dataset.\nVector of Locally Aggregated Descriptors (VLAD) is one of\nthe local features that can be used for image place recognition.\nVLAD describes an image by the difference of its local feature\ndescriptors from an already computed codebook. Generally, a\nvisual codebook is generated from k-means clustering of the\ndescriptors. However, the dimensionality of visual features is\nnot trivial and the computational load of sample distances in\na large image dataset is challenging. In order to design an\naccurate image retrieval method with affordable computation\nexpenses, we propose to use the mini-batch k-means clustering\nto compute VLAD descriptor(MB-VLAD). The proposed MBVLAD technique shows advantage in retrieval accuracy in\ncomparison with the state of the art techniques.},\nkeywords =  {feature extraction;content-based image retrieval;image processing}, \ndoi =  {10.1109/ISSC.2019.8904931}, \nurl={https://mural.maynoothuniversity.ie/15129/1/RD_mini%20batch.pdf},\nISSN =  {2688-1446},\nmonth =  {June}}\n
\n
\n\n\n
\n This study investigates the visual place retrieval of an image query using a geotagged image dataset. Vector of Locally Aggregated Descriptors (VLAD) is one of the local features that can be used for image place recognition. VLAD describes an image by the difference of its local feature descriptors from an already computed codebook. Generally, a visual codebook is generated from k-means clustering of the descriptors. However, the dimensionality of visual features is not trivial and the computational load of sample distances in a large image dataset is challenging. In order to design an accurate image retrieval method with affordable computation expenses, we propose to use the mini-batch k-means clustering to compute the VLAD descriptor (MB-VLAD). The proposed MB-VLAD technique shows an advantage in retrieval accuracy in comparison with the state of the art techniques.\n
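The MB-VLAD idea can be sketched in a few lines: fit a codebook with mini-batch k-means, then aggregate, for each visual word, the residuals of an image's descriptors with respect to the word centre, and L2-normalise the concatenated vector. Random vectors below stand in for real local descriptors (e.g. SIFT); codebook size and descriptor dimension are illustrative assumptions.

```python
# Sketch of VLAD aggregation with a mini-batch k-means codebook (MB-VLAD idea).
# Random descriptors, codebook size and dimensions are illustrative assumptions.
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def vlad(descriptors, codebook):
    words = codebook.predict(descriptors)
    k, d = codebook.cluster_centers_.shape
    v = np.zeros((k, d))
    for c in range(k):
        members = descriptors[words == c]
        if len(members):
            v[c] = (members - codebook.cluster_centers_[c]).sum(axis=0)  # residuals
    v = v.ravel()
    return v / (np.linalg.norm(v) + 1e-12)

rng = np.random.default_rng(0)
training_desc = rng.random((5000, 64))                    # descriptors from the dataset
codebook = MiniBatchKMeans(n_clusters=16, batch_size=256, n_init=3,
                           random_state=0).fit(training_desc)
query_desc = rng.random((300, 64))                        # descriptors of one image
print(vlad(query_desc, codebook).shape)                   # (16 * 64,) = (1024,)
```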
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Performance-Oriented Neural Architecture Search.\n \n \n \n \n\n\n \n Anderson, A.; Su, J.; Dahyot, R.; and Gregg, D.\n\n\n \n\n\n\n In 2019 International Conference on High Performance Computing Simulation (HPCS), pages 177-184, 2019. \n \n\n\n\n
\n\n\n\n \n \n \"Performance-OrientedPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@INPROCEEDINGS{9188213,\nauthor = {A. {Anderson} and J. {Su} and R. {Dahyot} and D. {Gregg}},\nbooktitle = {2019 International Conference on High Performance Computing   Simulation (HPCS)}, \ntitle = {Performance-Oriented Neural Architecture Search}, \nyear = {2019},\nurl = {https://arxiv.org/pdf/2001.02976.pdf},\nabstract = {Hardware-Software Co-Design is a highly successful strategy for improving performance of domain-specific computing systems. We argue for the application of the same methodology to deep learning; specifically, we propose to extend neural architecture search with information about the hardware to ensure that the model designs produced are highly efficient in addition to the typical criteria around accuracy. Using the task of keyword spotting in audio on edge computing devices, we demonstrate that our approach results in neural architecture that is not only highly accurate, but also efficiently mapped to the computing platform which will perform the inference. Using our modified neural architecture search, we demonstrate 0.88\\% increase in TOP-1 accuracy with 1.85× reduction in latency for keyword spotting in audio on an embedded SoC, and 1.59× on a high-end GPU.},\neprint =  {2001.02976}, \narchivePrefix =  {arXiv}, \nvolume = {},\nnumber = {},\npages = {177-184},\ndoi = {10.1109/HPCS48598.2019.9188213}}\n
\n
\n\n\n
\n Hardware-Software Co-Design is a highly successful strategy for improving performance of domain-specific computing systems. We argue for the application of the same methodology to deep learning; specifically, we propose to extend neural architecture search with information about the hardware to ensure that the model designs produced are highly efficient in addition to the typical criteria around accuracy. Using the task of keyword spotting in audio on edge computing devices, we demonstrate that our approach results in neural architecture that is not only highly accurate, but also efficiently mapped to the computing platform which will perform the inference. Using our modified neural architecture search, we demonstrate 0.88% increase in TOP-1 accuracy with 1.85× reduction in latency for keyword spotting in audio on an embedded SoC, and 1.59× on a high-end GPU.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Entropic Regularisation of Robust Optimal Transport.\n \n \n \n \n\n\n \n Dahyot, R.; Alghamdi, H.; and Grogan, M.\n\n\n \n\n\n\n In Irish Machine Vision and Image Processing conference 2019, volume abs/1905.12678, 2019. \n \n\n\n\n
\n\n\n\n \n \n \"EntropicPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{DBLP:journals/corr/abs-1905-12678,\nauthor     =  {Rozenn Dahyot and\nHana Alghamdi and\nMair{\\'{e}}ad Grogan},\ntitle      =  {Entropic Regularisation of Robust Optimal Transport},\nabstract = {Grogan et al have recently proposed a solution to colour transfer by minimising the Euclidean distance L2 between two probability density functions capturing the colour distributions of two images (palette and target). It was shown to be very competitive to alternative solutions based on Optimal Transport for colour transfer. We show that in fact Grogan et al's formulation can also be understood as a new robust Optimal Transport based framework with entropy regularisation over marginals.},\nbooktitle   =  {Irish Machine Vision and Image Processing conference 2019},\nvolume     =  {abs/1905.12678},\nyear       =  {2019},\nurl        =  {https://arxiv.org/pdf/1905.12678.pdf},\narchivePrefix  =  {arXiv},\ndoi = {10.21427/w611-mb37},\neprint     =  {1905.12678},\ntimestamp  =  {Mon, 03 Jun 2019 13:42:33 +0200},\nbiburl     =  {https://dblp.org/rec/bib/journals/corr/abs-1905-12678},\nbibsource  =  {dblp computer science bibliography, https://dblp.org},\n}
\n
\n\n\n
\n Grogan et al have recently proposed a solution to colour transfer by minimising the Euclidean distance L2 between two probability density functions capturing the colour distributions of two images (palette and target). It was shown to be very competitive to alternative solutions based on Optimal Transport for colour transfer. We show that in fact Grogan et al's formulation can also be understood as a new robust Optimal Transport based framework with entropy regularisation over marginals.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n AI Pipeline - bringing AI to you. End-to-end integration of data, algorithms and deployment tools.\n \n \n \n \n\n\n \n de Prado, M.; Su, J.; Dahyot, R.; Saeed, R.; Keller, L.; and Vállez, N.\n\n\n \n\n\n\n In HiPEAC 2019 workshop Emerging Deep Learning Accelerator, volume abs/1901.05049v1, 2019. \n \n\n\n\n
\n\n\n\n \n \n \"AIPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 1 download\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{DBLP:journals/corr/abs-1901-05049, \nauthor =  {Miguel de Prado and Jing Su and Rozenn Dahyot and Rabia Saeed and Lorenzo Keller and Noelia V{\\'{a}}llez}, \ntitle =  {{AI} Pipeline - bringing {AI} to you. End-to-end integration of data, algorithms and deployment tools},\nbooktitle =  {HiPEAC 2019 workshop Emerging Deep Learning Accelerator}, \nurl = {https://arxiv.org/pdf/1901.05049v1.pdf},\nabstract = {Next generation of embedded Information and Communication Technology (ICT) systems are interconnected collaborative intelligent systems able to perform autonomous tasks. Training and deployment\nof such systems on Edge devices however require a fine-grained integration of data and tools to\nachieve high accuracy and overcome functional and non-functional requirements.\nIn this work, we present a modular AI pipeline as an integrating framework to bring data, algorithms\nand deployment tools together. By these means, we are able to interconnect the different entities or\nstages of particular systems and provide an end-to-end development of AI products. We demonstrate\nthe effectiveness of the AI pipeline by solving an Automatic Speech Recognition challenge and we\nshow that all the steps leading to an end-to-end development for Key-word Spotting tasks: importing,\npartitioning and pre-processing of speech data, training of different neural network architectures and\ntheir deployment on heterogeneous embedded platforms.},\nvolume =  {abs/1901.05049v1}, \nyear =  {2019}, \narchivePrefix =  {arXiv}, \neprint =  {1901.05049v1}, \n}\n
\n
\n\n\n
\n The next generation of embedded Information and Communication Technology (ICT) systems consists of interconnected collaborative intelligent systems able to perform autonomous tasks. Training and deployment of such systems on Edge devices, however, require a fine-grained integration of data and tools to achieve high accuracy and meet functional and non-functional requirements. In this work, we present a modular AI pipeline as an integrating framework to bring data, algorithms and deployment tools together. By these means, we are able to interconnect the different entities or stages of particular systems and provide an end-to-end development of AI products. We demonstrate the effectiveness of the AI pipeline by solving an Automatic Speech Recognition challenge, and we show all the steps leading to an end-to-end development of Key-word Spotting tasks: importing, partitioning and pre-processing of speech data, training of different neural network architectures and their deployment on heterogeneous embedded platforms.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Automatic detection of passable roads after floods in remote sensed and social media data.\n \n \n \n \n\n\n \n Ahmad, K.; Pogorelov, K.; Riegler, M.; Ostroukhova, O.; Halvorsen, P.; Conci, N.; and Dahyot, R.\n\n\n \n\n\n\n Signal Processing: Image Communication, 74: 110 - 118. 2019.\n Arxiv: https://arxiv.org/pdf/1901.03298.pdf\n\n\n\n
\n\n\n\n \n \n \"AutomaticPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 2 downloads\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{AHMAD2019110,\ntitle =  {Automatic detection of passable roads after floods in remote sensed and social media data}, \njournal =  {Signal Processing: Image Communication}, \nvolume =  {74},\npages =  {110 - 118}, \nyear =  {2019}, \nissn =  {0923-5965}, \ndoi =  {10.1016/j.image.2019.02.002},\nnote= {Arxiv: https://arxiv.org/pdf/1901.03298.pdf},\nurl =  {https://mural.maynoothuniversity.ie/15100/1/RD_signal.pdf}, \nauthor =  { Kashif Ahmad and Konstantin Pogorelov and Michael Riegler and Olga Ostroukhova and Pål Halvorsen and Nicola Conci and Rozenn Dahyot}, \nkeywords =  {Flood detection, Convolutional neural networks, Natural disasters, Social media, Satellite imagery, Multimedia indexing and retrieval}, \nabstract =  {This paper addresses the problem of floods classification and floods aftermath detection based on both social media and satellite imagery. Automatic detection of disasters such as floods is still a very challenging task. The focus lies on identifying passable routes or roads during floods. Two novel solutions are presented, which were developed for two corresponding tasks at the MediaEval 2018 benchmarking challenge. The tasks are (i) identification of images providing evidence for road passability and (ii) differentiation and detection of passable and non-passable roads in images from two complementary sources of information. For the first challenge, we mainly rely on object and scene-level features extracted through multiple deep models pre-trained on the ImageNet and Places datasets. The object and scene-level features are then combined using early, late and double fusion techniques. To identify whether or not it is possible for a vehicle to pass a road in satellite images, we rely on Convolutional Neural Networks and a transfer learning-based classification approach. The evaluation of the proposed methods is carried out on the large-scale datasets provided for the benchmark competition. The results demonstrate significant improvement in the performance over the recent state-of-art approaches.}}\n\n
\n
\n\n\n
\n This paper addresses the problem of floods classification and floods aftermath detection based on both social media and satellite imagery. Automatic detection of disasters such as floods is still a very challenging task. The focus lies on identifying passable routes or roads during floods. Two novel solutions are presented, which were developed for two corresponding tasks at the MediaEval 2018 benchmarking challenge. The tasks are (i) identification of images providing evidence for road passability and (ii) differentiation and detection of passable and non-passable roads in images from two complementary sources of information. For the first challenge, we mainly rely on object and scene-level features extracted through multiple deep models pre-trained on the ImageNet and Places datasets. The object and scene-level features are then combined using early, late and double fusion techniques. To identify whether or not it is possible for a vehicle to pass a road in satellite images, we rely on Convolutional Neural Networks and a transfer learning-based classification approach. The evaluation of the proposed methods is carried out on the large-scale datasets provided for the benchmark competition. The results demonstrate significant improvement in the performance over the recent state-of-art approaches.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Denoising RENOIR Image Dataset with DBSR.\n \n \n \n \n\n\n \n Albluwi, F.; Krylov, V. A.; and Dahyot, R.\n\n\n \n\n\n\n In Irish Machine Vision and Image Processing (IMVIP 2019), volume ISBN 978-0-9934207-4-0, pages 76-79, Technological University Dublin, 28-30 August 2019. \n \n\n\n\n
\n\n\n\n \n \n \"DenoisingPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{IMVIP2019Albluwi, \ntitle =  {Denoising RENOIR Image Dataset with DBSR}, \nauthor =  {Fatma Albluwi and Vladimir A. Krylov and R. Dahyot},\nabstract = {Noise reduction algorithms have often been evaluated using images degraded by artificially synthesised\nnoise. The RENOIR image dataset provides an alternative way for testing noise reduction algorithms\non real noisy images and we propose in this paper to assess our CNN called De-Blurring Super-Resolution\n(DBSR) to reduce the natural noise due to low light conditions in a RENOIR dataset.},\nbooktitle =  {Irish Machine Vision and Image Processing (IMVIP 2019)}, \naddress =  {Technological University Dublin}, month =  {28-30 August}, \nyear =  {2019}, \nurl = {https://arrow.tudublin.ie/cgi/viewcontent.cgi?article=1006&context=impstwo},\ndoi = {10.21427/g34k-8r27},\npages =  {76-79}, \nvolume =  {ISBN 978-0-9934207-4-0}}\n\n
\n
\n\n\n
\n Noise reduction algorithms have often been evaluated using images degraded by artificially synthesised noise. The RENOIR image dataset provides an alternative way for testing noise reduction algorithms on real noisy images and we propose in this paper to assess our CNN called De-Blurring Super-Resolution (DBSR) to reduce the natural noise due to low light conditions in a RENOIR dataset.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Harmonic Networks for Image Classification.\n \n \n \n \n\n\n \n Ulicny, M.; Krylov, V.; and Dahyot, R.\n\n\n \n\n\n\n In British Machine Vision Conference (BMVC), Cardiff UK, 9-12 September 2019. \n Github: https://github.com/matej-ulicny/harmonic-networks\n\n\n\n
\n\n\n\n \n \n \"HarmonicPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{BMVC2019,\ntitle =  {Harmonic Networks for Image Classification}, \nauthor =  {M. Ulicny and V. Krylov and R. Dahyot}, \nbooktitle =  {British Machine Vision Conference (BMVC)}, \naddress =  {Cardiff UK}, \nmonth =  {9-12 September},\nabstract = {Convolutional neural networks (CNNs) learn filters in order to capture local correlation patterns in feature space. In contrast, in this paper we propose harmonic blocks that\nproduce features by learning optimal combinations of responses to preset spectral filters.\nWe rely on the use of the Discrete Cosine Transform filters which have excellent energy\ncompaction properties and are widely used for image compression. \nThe proposed harmonic blocks are intended to replace conventional convolutional layers to produce partially or fully harmonic versions of new or existing CNN architectures. We demonstrate\nhow the harmonic networks can be efficiently compressed by exploiting redundancy in\nspectral domain and truncating high-frequency information. We extensively validate our\napproach and show that the introduction of harmonic blocks into state-of-the-art CNN\nmodels results in improved classification performance on CIFAR and ImageNet datasets.},\nurl = {https://bmvc2019.org/wp-content/uploads/papers/0628-paper.pdf},\nnote = {Github: https://github.com/matej-ulicny/harmonic-networks},\nyear =  {2019}}\n
\n
\n\n\n
\n Convolutional neural networks (CNNs) learn filters in order to capture local correlation patterns in feature space. In contrast, in this paper we propose harmonic blocks that produce features by learning optimal combinations of responses to preset spectral filters. We rely on the use of the Discrete Cosine Transform filters which have excellent energy compaction properties and are widely used for image compression. The proposed harmonic blocks are intended to replace conventional convolutional layers to produce partially or fully harmonic versions of new or existing CNN architectures. We demonstrate how the harmonic networks can be efficiently compressed by exploiting redundancy in spectral domain and truncating high-frequency information. We extensively validate our approach and show that the introduction of harmonic blocks into state-of-the-art CNN models results in improved classification performance on CIFAR and ImageNet datasets.\n
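A harmonic block as described above applies a fixed spectral filter bank to every input channel and learns only how to combine the responses. The PyTorch sketch below assumes an unnormalised 3x3 DCT-II basis and a learned 1x1 convolution; filter size and channel counts are illustrative assumptions rather than the paper's exact architecture.

```python
# Minimal sketch of a harmonic block (PyTorch): responses to a preset 3x3 DCT
# filter bank are combined by a learned 1x1 convolution. Sizes are illustrative.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def dct_filter_bank(k=3):
    """k*k separable 2D DCT-II basis filters (unnormalised), shape (k*k, 1, k, k)."""
    x = torch.arange(k, dtype=torch.float32)
    basis = [torch.cos(math.pi * (x + 0.5) * u / k) for u in range(k)]
    filters = torch.stack([torch.outer(basis[u], basis[v])
                           for u in range(k) for v in range(k)])
    return filters.unsqueeze(1)

class HarmonicBlock(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.k = k
        self.register_buffer("dct", dct_filter_bank(k))             # preset, not learned
        self.mix = nn.Conv2d(in_ch * k * k, out_ch, kernel_size=1)  # learned combination

    def forward(self, x):
        n, c, h, w = x.shape
        spectral = F.conv2d(x.reshape(n * c, 1, h, w), self.dct, padding=self.k // 2)
        spectral = spectral.reshape(n, c * self.k * self.k, h, w)
        return self.mix(spectral)

block = HarmonicBlock(3, 16)
print(block(torch.randn(2, 3, 32, 32)).shape)   # torch.Size([2, 16, 32, 32])
```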
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n L2 Divergence for robust colour transfer.\n \n \n \n \n\n\n \n Grogan, M.; and Dahyot, R.\n\n\n \n\n\n\n Computer Vision and Image Understanding. 2019.\n Github: https://github.com/groganma/gmm-colour-transfer\n\n\n\n
\n\n\n\n \n \n \"L2Paper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 2 downloads\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{GROGAN2019, \ntitle =  {L2 Divergence for robust colour transfer}, \njournal =  {Computer Vision and Image Understanding},\nyear =  {2019},\nnote = {Github: https://github.com/groganma/gmm-colour-transfer},\nissn =  {1077-3142},\ndoi =  {10.1016/j.cviu.2019.02.002}, \nurl =  {https://mural.maynoothuniversity.ie/15103/1/RB_L2.pdf}, \nauthor =  {Mairead Grogan and Rozenn Dahyot}, \nkeywords =  {Colour Transfer, L2 Registration, Re-colouring, Colour Grading}, \nabstract =  {Optimal Transport (OT) is a very popular framework for performing colour transfer \nin images and videos. We have proposed an alternative framework where the cost function used for \ninferring a parametric transfer function is defined as the robust L2 divergence between two \nprobability density functions. In this paper, we show that our approach combines many advantages \nof state of the art techniques and outperforms many recent algorithms as measured quantitatively \nwith standard quality metrics, and qualitatively using perceptual studies. Mathematically, our \nformulation is presented in contrast to the OT cost function that shares similarities with our cost function. \nOur formulation, however, is more flexible as it allows colour correspondences that may be available to be taken \ninto account and performs well despite potential occurrences of correspondence outlier pairs. Our algorithm is shown to be \nfast, robust and it easily allows for user interaction providing freedom for artists to fine tune the recoloured images and videos.}}\n\n
\n
\n\n\n
\n Optimal Transport (OT) is a very popular framework for performing colour transfer in images and videos. We have proposed an alternative framework where the cost function used for inferring a parametric transfer function is defined as the robust L2 divergence between two probability density functions. In this paper, we show that our approach combines many advantages of state of the art techniques and outperforms many recent algorithms as measured quantitatively with standard quality metrics, and qualitatively using perceptual studies. Mathematically, our formulation is presented in contrast to the OT cost function that shares similarities with our cost function. Our formulation, however, is more flexible as it allows colour correspondences that may be available to be taken into account and performs well despite potential occurrences of correspondence outlier pairs. Our algorithm is shown to be fast, robust and it easily allows for user interaction providing freedom for artists to fine tune the recoloured images and videos.\n
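When both colour distributions are modelled as isotropic Gaussian mixtures centred on colour samples, the L2 cost between them has a closed form built from pairwise Gaussian evaluations. The sketch below computes that cost for toy samples; the bandwidth and the data are assumptions, and the paper optimises a parametric transfer function against this kind of cost rather than merely evaluating it.

```python
# Sketch of the L2 cost between two colour distributions, each modelled as an
# isotropic Gaussian mixture centred on colour samples. Bandwidth h and the
# toy samples are assumptions.
import numpy as np

def gauss_cross_term(a, b, h):
    """Average over pairs (i, j) of the integral of N(x; a_i, h^2 I) N(x; b_j, h^2 I) dx."""
    d = a.shape[1]
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (4 * h * h)).mean() / (4 * np.pi * h * h) ** (d / 2)

def l2_divergence(p_samples, q_samples, h=0.05):
    return (gauss_cross_term(p_samples, p_samples, h)
            - 2 * gauss_cross_term(p_samples, q_samples, h)
            + gauss_cross_term(q_samples, q_samples, h))

rng = np.random.default_rng(0)
target = rng.normal([0.6, 0.5, 0.4], 0.05, (500, 3))            # target image colours
recoloured_good = rng.normal([0.6, 0.5, 0.4], 0.05, (500, 3))   # well-matched recolouring
recoloured_bad = rng.normal([0.2, 0.2, 0.2], 0.05, (500, 3))    # poorly matched recolouring
print(l2_divergence(recoloured_good, target), "<", l2_divergence(recoloured_bad, target))
```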
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Object Geolocation from Crowdsourced Street Level Imagery.\n \n \n \n \n\n\n \n Krylov, V. A.; and Dahyot, R.\n\n\n \n\n\n\n In Alzate, C.; Monreale, A.; Assem, H.; Bifet, A.; Buda, T. S.; Caglayan, B.; Drury, B.; García-Martín, Eva; Gavaldà, R.; Kramer, S.; Lavesson, N.; Madden, M.; Molloy, I.; Nicolae, M.; and Sinn, M., editor(s), ECML PKDD 2018 Workshops, pages 79–83, Cham, 2019. Springer International Publishing\n \n\n\n\n
\n\n\n\n \n \n \"ObjectPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 1 download\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@InProceedings{10.1007/978-3-030-13453-2_7, \nauthor =  {Krylov, Vladimir A. and Dahyot, Rozenn}, \neditor =  {Alzate, Carlos\nand Monreale, Anna\nand Assem, Haytham\nand Bifet, Albert\nand Buda, Teodora Sandra\nand Caglayan, Bora\nand Drury, Brett\nand Garc{\\'i}a-Mart{\\'i}n, Eva\nand Gavald{\\`a}, Ricard\nand Kramer, Stefan\nand Lavesson, Niklas\nand Madden, Michael\nand Molloy, Ian\nand Nicolae, Maria-Irina\nand Sinn, Mathieu},\ndoi =  {10.1007/978-3-030-13453-2_7}, \nurl = {https://mural.maynoothuniversity.ie/15249/1/RD_object.pdf},\ntitle =  {Object Geolocation from Crowdsourced Street Level Imagery}, \nbooktitle =  {ECML PKDD 2018 Workshops}, \nyear =  {2019}, \npublisher =  {Springer International Publishing}, \naddress =  {Cham}, \npages =  {79--83}, \nabstract =  {We explore the applicability and limitations of a state-of-the-art object \ndetection and geotagging system [4] applied to crowdsourced image data. Our experiments with \nimagery from Mapillary crowdsourcing platform demonstrate that with increasing amount of images,\nthe detection accuracy is getting close to that obtained with high-end street level data. Nevertheless,\ndue to excessive camera position noise, the estimated geolocation (position) of the detected object is \nless accurate on crowdsourced Mapillary imagery than with high-end street level imagery obtained by Google Street View.},\nisbn =  {978-3-030-13453-2}}\n\n
\n
\n\n\n
\n We explore the applicability and limitations of a state-of-the-art object detection and geotagging system [4] applied to crowdsourced image data. Our experiments with imagery from Mapillary crowdsourcing platform demonstrate that with increasing amount of images, the detection accuracy is getting close to that obtained with high-end street level data. Nevertheless, due to excessive camera position noise, the estimated geolocation (position) of the detected object is less accurate on crowdsourced Mapillary imagery than with high-end street level imagery obtained by Google Street View.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Using WGAN for Improving Imbalanced Classification Performance.\n \n \n \n \n\n\n \n Bhatia, S.; and Dahyot, R.\n\n\n \n\n\n\n In Curry, E.; Keane, M.; Ojo, A.; and Salwala, D., editor(s), 27th Irish Conference on Artificial Intelligence and Cognitive Science, pages 365-375, Galway, Ireland, 2019. \n \n\n\n\n
\n\n\n\n \n \n \"UsingPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{Bhatia2019, \ntitle =  {Using WGAN for Improving Imbalanced  Classification Performance}, \nauthor =  {S. Bhatia and R. Dahyot}, \nbooktitle =  {27th Irish Conference on Artificial Intelligence and Cognitive Science}, \naddress =  {Galway, Ireland}, \nissn =  {1613-0073},\nyear =  {2019}, \neditor =  {Edward Curry and Mark Keane and Adegboyega Ojo and Dhaval Salwala}, \npages =  {365-375}, \nabstract = {This paper investigates data synthesis with a Generative Adversarial Network (GAN) for augmenting the amount of data used for\ntraining classifiers (in supervised learning) to compensate for class imbalance (when the classes are not represented equally by the same number of\ntraining samples). Our data synthesis approach with GAN is compared\nwith data augmentation in the context of image classification. Our experimental results show encouraging results in comparison to standard\ndata augmentation schemes based on image transforms.},\nurl =  {http://ceur-ws.org/Vol-2563/aics_34.pdf}}\n\n
\n
\n\n\n
\n This paper investigates data synthesis with a Generative Adversarial Network (GAN) for augmenting the amount of data used for training classifiers (in supervised learning) to compensate for class imbalance (when the classes are not represented equally by the same number of training samples). Our data synthesis approach with GAN is compared with data augmentation in the context of image classification. Our experimental results show encouraging results in comparison to standard data augmentation schemes based on image transforms.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2018\n \n \n (6)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Object Geolocation Using MRF Based Multi-Sensor Fusion.\n \n \n \n \n\n\n \n Krylov, V. A.; and Dahyot, R.\n\n\n \n\n\n\n In 2018 25th IEEE International Conference on Image Processing (ICIP), pages 2745-2749, Oct 2018. \n \n\n\n\n
\n\n\n\n \n \n \"ObjectPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 1 download\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@INPROCEEDINGS{8451458,\nauthor =  {V. A. Krylov and R. Dahyot},\nbooktitle =  {2018 25th IEEE International Conference on Image Processing (ICIP)}, \ntitle =  {Object Geolocation Using MRF Based Multi-Sensor Fusion}, \nyear =  {2018}, \nabstract = {Abundant image and sensory data collected over the last\ndecades represents an invaluable source of information for\ncataloging and monitoring of the environment. Fusion of heterogeneous data sources is a challenging but promising tool\nto efficiently leverage such information. In this work we propose a pipeline for automatic detection and geolocation of\nrecurring stationary objects deployed on fusion scenario of\nstreet level imagery and LiDAR point cloud data. The objects\nare geolocated coherently using a fusion procedure formalized as a Markov random field problem. This allows us to efficiently combine information from object segmentation, triangulation, monocular depth estimation and position matching with LiDAR data. The proposed fusion approach produces object mappings robust to scenes reporting multiple\nobject instances. We introduce a new challenging dataset of\nover 200 traffic lights in Dublin city centre and demonstrate\nhigh performance of the proposed methodology and its capacity to perform multi-sensor data fusion.},\nvolume =  {}, \nnumber =  {}, \npages =  {2745-2749}, \nkeywords =  {Laser radar;Three-dimensional displays;Cameras;Geology;Roads;Pipelines;Image segmentation;Object geolocation;street level imagery;LiDAR data;Markov random fields;traffic lights}, \nurl = {https://mural.maynoothuniversity.ie/15253/1/RD_object%20geolocation.pdf},\ndoi =  {10.1109/ICIP.2018.8451458},\nISSN =  {2381-8549}, \nmonth =  {Oct}}\n
\n
\n\n\n
\n Abundant image and sensory data collected over the last decades represents an invaluable source of information for cataloging and monitoring of the environment. Fusion of heterogeneous data sources is a challenging but promising tool to efficiently leverage such information. In this work we propose a pipeline for automatic detection and geolocation of recurring stationary objects deployed on fusion scenario of street level imagery and LiDAR point cloud data. The objects are geolocated coherently using a fusion procedure formalized as a Markov random field problem. This allows us to efficiently combine information from object segmentation, triangulation, monocular depth estimation and position matching with LiDAR data. The proposed fusion approach produces object mappings robust to scenes reporting multiple object instances. We introduce a new challenging dataset of over 200 traffic lights in Dublin city centre and demonstrate high performance of the proposed methodology and its capacity to perform multi-sensor data fusion.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Image Deblurring and Super-Resolution Using Deep Convolutional Neural Networks.\n \n \n \n \n\n\n \n Albluwi, F.; Krylov, V. A.; and Dahyot, R.\n\n\n \n\n\n\n In 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP), pages 1-6, Sept 2018. \n Github: https://github.com/Fatma-Albluwi/DBSRCNN\n\n\n\n
\n\n\n\n \n \n \"ImagePaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@INPROCEEDINGS{8516983, \nauthor =  {F. Albluwi and V. A. Krylov and R. Dahyot},\nbooktitle =  {2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP)}, \ntitle =  {Image Deblurring and Super-Resolution Using Deep Convolutional Neural Networks}, \nyear =  {2018},\nvolume =  {}, \nnumber =  {}, \npages =  {1-6},\nabstract = {Recently multiple high performance algorithms have been developed to infer high-resolution images from low-resolution\nimage input using deep learning algorithms. The related\nproblem of super-resolution from blurred or corrupted low-resolution images has however received much less attention.\nIn this work, we propose a new deep learning approach that\nsimultaneously addresses deblurring and super-resolution\nfrom blurred low resolution images. We evaluate the state-of-the-art super-resolution convolutional neural network (SRCNN)\narchitecture proposed in [1] for the blurred reconstruction scenario and propose a revised deeper architecture that\nproves its superiority experimentally both when the levels of\nblur are known and unknown a priori.},\nnote = {Github: https://github.com/Fatma-Albluwi/DBSRCNN},\nkeywords =  {convolution;image reconstruction;image resolution;image restoration;learning (artificial intelligence);neural nets;image deblurring;deep convolutional neural networks;multiple high performance algorithms;high-resolution images;low-resolution image input;deep learning algorithms;low-resolution images;deep learning approach;blurred low resolution images;super-resolution convolutional neural network;Training;Image resolution;Pipelines;Image reconstruction;Signal resolution;Feature extraction;Convolutional neural networks;Image super-resolution;deblurring;deep learning;convolutional neural networks}, \ndoi =  {10.1109/MLSP.2018.8516983},\nurl={https://mural.maynoothuniversity.ie/15254/1/RD_image.pdf},\nISSN =  {1551-2541}, month =  {Sept}}\n\n
\n
\n\n\n
\n Recently multiple high performance algorithms have been developed to infer high-resolution images from low-resolution image input using deep learning algorithms. The related problem of super-resolution from blurred or corrupted low-resolution images has however received much less attention. In this work, we propose a new deep learning approach that simultaneously addresses deblurring and super-resolution from blurred low resolution images. We evaluate the state-of-the-art super-resolution convolutional neural network (SRCNN) architecture proposed in [1] for the blurred reconstruction scenario and propose a revised deeper architecture that proves its superiority experimentally both when the levels of blur are known and unknown a priori.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Automatic Discovery and Geotagging of Objects from Street View Imagery.\n \n \n \n \n\n\n \n Krylov, V.; Kenny, E.; and Dahyot, R.\n\n\n \n\n\n\n Remote Sensing, 10(5): 661. April 2018.\n Github: https://github.com/vlkryl/streetview_objectmapping - URI: http://hdl.handle.net/2262/89654 - Arxiv: https://arxiv.org/pdf/1708.08417.pdf\n\n\n\n
\n\n\n\n \n \n \"AutomaticPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 7 downloads\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{Krylov_2018,\ndoi  =  {10.3390/rs10050661},\nurl  =  {https://www.mdpi.com/2072-4292/10/5/661/pdf?version=1525349637},\nnote = {Github: https://github.com/vlkryl/streetview_objectmapping - URI: http://hdl.handle.net/2262/89654 - Arxiv: https://arxiv.org/pdf/1708.08417.pdf},\nabstract = {Many applications such as autonomous navigation, urban planning and asset monitoring, rely on the availability of accurate information about objects and their geolocations. In this paper we propose to automatically detect and compute the GPS coordinates of recurring stationary objects of interest using street view imagery. Our processing pipeline relies on two fully convolutional neural networks: the first segments objects in the images while the second estimates their distance from the camera. To geolocate all the detected objects coherently we propose a novel custom Markov Random Field model to perform objects triangulation. The novelty of the resulting pipeline is the combined use of monocular depth estimation and triangulation to enable automatic mapping of complex scenes with multiple visually similar objects of interest. We validate experimentally the effectiveness of our approach on two object classes: traffic lights and telegraph poles. The experiments report high object recall rates and GPS accuracy within 2 meters, which is comparable with the precision of single-frequency GPS receivers.},\nyear  =  2018,\npublisher  =  {{MDPI} {AG}},\nvolume  =  {10},\nnumber  =  {5},\npages  =  {661},\nauthor  =  {Vladimir Krylov and Eamonn Kenny and Rozenn Dahyot},\ntitle  =  {Automatic Discovery and Geotagging of Objects from Street View Imagery},\njournal  =  {Remote Sensing},\nmonth  =  {April},\n}
\n
\n\n\n
\n Many applications, such as autonomous navigation, urban planning and asset monitoring, rely on the availability of accurate information about objects and their geolocations. In this paper we propose to automatically detect and compute the GPS coordinates of recurring stationary objects of interest using street view imagery. Our processing pipeline relies on two fully convolutional neural networks: the first segments objects in the images while the second estimates their distance from the camera. To geolocate all the detected objects coherently, we propose a novel custom Markov Random Field model to perform object triangulation. The novelty of the resulting pipeline is the combined use of monocular depth estimation and triangulation to enable automatic mapping of complex scenes with multiple visually similar objects of interest. We validate the effectiveness of our approach experimentally on two object classes: traffic lights and telegraph poles. The experiments report high object recall rates and GPS accuracy within 2 meters, which is comparable with the precision of single-frequency GPS receivers.\n
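The triangulation ingredient can be illustrated with a small toy example (an assumption-laden sketch, not the released streetview_objectmapping code, which additionally uses segmentation, monocular depth and the MRF): given two camera positions in a local metric frame and a bearing to the detected object from each view, the object position is the least-squares intersection of the two rays.

# Toy least-squares ray intersection in a local 2D metric frame (east/north).
import numpy as np

def intersect_rays(p1, b1, p2, b2):
    """p1, p2: camera positions (2,); b1, b2: bearings in radians (0 = east)."""
    d1 = np.array([np.cos(b1), np.sin(b1)])
    d2 = np.array([np.cos(b2), np.sin(b2)])
    # Solve p1 + t1*d1 = p2 + t2*d2 in the least-squares sense.
    A = np.stack([d1, -d2], axis=1)
    t, *_ = np.linalg.lstsq(A, p2 - p1, rcond=None)
    x1 = p1 + t[0] * d1
    x2 = p2 + t[1] * d2
    return 0.5 * (x1 + x2)            # midpoint of the closest points

print(intersect_rays(np.array([0.0, 0.0]), np.deg2rad(45),
                     np.array([10.0, 0.0]), np.deg2rad(135)))
# -> approximately [5., 5.]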
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n 3D point cloud segmentation using GIS.\n \n \n \n \n\n\n \n Liu, C.; Vladimir, K.; and Dahyot, R.\n\n\n \n\n\n\n In Irish Machine Vision and Image Processing conference (IMVIP 2018), volume e-book of proceedings with ISBN 978-0-9934207-3-3, Ulster University, Northern Ireland, 2018. \n http://hdl.handle.net/2262/89508\n\n\n\n
\n\n\n\n \n \n \"3DPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{LiuIMVIP2018, \ntitle =  {3D point cloud segmentation using GIS},\nauthor =  {C.-J. Liu and K. Vladimir and R. Dahyot}, \nbooktitle =  {Irish Machine Vision and Image Processing conference (IMVIP 2018)}, \nyear =  {2018},\nvolume =  {e-book of proceedings with ISBN 978-0-9934207-3-3},\nurl =  {https://arxiv.org/pdf/2108.06306.pdf}, \nnote = {http://hdl.handle.net/2262/89508},\naddress =  {Ulster University, Northern Ireland}}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Harmonic Networks: Integrating Spectral Information into CNNs.\n \n \n \n \n\n\n \n Ulicny, M.; Krylov, V. A.; and Dahyot, R.\n\n\n \n\n\n\n Technical Report Trinity College Dublin Ireland, 2018.\n Github: https://github.com/matej-ulicny/harmonic-networks\n\n\n\n
\n\n\n\n \n \n \"HarmonicPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@techreport{DBLP:journals/corr/abs-1812-03205, \nauthor =  {Matej Ulicny and Vladimir A. Krylov and Rozenn Dahyot}, \ntitle =  {Harmonic Networks: Integrating Spectral Information into CNNs}, \ninstitution =  {Trinity College Dublin Ireland},\nvolume =  {abs/1812.03205}, \nyear =  {2018},\nnote = {Github: https://github.com/matej-ulicny/harmonic-networks},\nabstract = {Convolutional neural networks (CNNs) learn filters in order to \ncapture local correlation patterns in feature space. In contrast, in this paper \nwe propose harmonic blocks that produce features by learning optimal combinations \nof spectral filters defined by the Discrete Cosine Transform. The harmonic blocks \nare used to replace conventional convolutional layers to construct partial or fully harmonic CNNs. \nWe extensively validate our approach and show that the introduction of harmonic blocks into state-of-the-art \nCNN baseline architectures results in comparable or better performance in classification tasks on small NORB, CIFAR10 and CIFAR100 datasets.},\nurl =  {https://arxiv.org/pdf/1812.03205.pdf}, \narchivePrefix =  {arXiv},\neprint =  {1812.03205} \n}\n\n
\n
\n\n\n
\n Convolutional neural networks (CNNs) learn filters in order to capture local correlation patterns in feature space. In contrast, in this paper we propose harmonic blocks that produce features by learning optimal combinations of spectral filters defined by the Discrete Cosine Transform. The harmonic blocks are used to replace conventional convolutional layers to construct partially or fully harmonic CNNs. We extensively validate our approach and show that the introduction of harmonic blocks into state-of-the-art CNN baseline architectures results in comparable or better performance in classification tasks on the small NORB, CIFAR10 and CIFAR100 datasets.\n
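The idea of replacing learned convolutional filters with a fixed DCT basis can be sketched as follows. This is a simplified, single-input-channel illustration; the actual harmonic blocks in the linked repository handle multiple channels and further design choices not shown here.

# Sketch: build a KxK 2D DCT-II basis, use it as fixed convolution weights,
# and let a 1x1 convolution learn how to combine the spectral responses.
import numpy as np
import torch
import torch.nn as nn

def dct_basis(k=3):
    """Return (k*k, k, k) array of orthonormal 2D DCT-II basis filters."""
    n = np.arange(k)
    alpha = np.sqrt(np.full(k, 2.0 / k)); alpha[0] = np.sqrt(1.0 / k)
    c = alpha[:, None] * np.cos(np.pi * (n[None, :] + 0.5) * np.arange(k)[:, None] / k)
    return np.einsum('ux,vy->uvxy', c, c).reshape(k * k, k, k)

class HarmonicBlockSketch(nn.Module):
    def __init__(self, out_channels=16, k=3):
        super().__init__()
        filters = torch.tensor(dct_basis(k), dtype=torch.float32).unsqueeze(1)
        self.dct = nn.Conv2d(1, k * k, kernel_size=k, padding=k // 2, bias=False)
        self.dct.weight.data.copy_(filters)
        self.dct.weight.requires_grad_(False)   # spectral filters stay fixed
        self.mix = nn.Conv2d(k * k, out_channels, kernel_size=1)  # learned mixing

    def forward(self, x):
        return self.mix(self.dct(x))

block = HarmonicBlockSketch()
print(block(torch.randn(1, 1, 32, 32)).shape)   # torch.Size([1, 16, 32, 32])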
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Shape registration with directional data.\n \n \n \n \n\n\n \n Grogan, M.; and Dahyot, R.\n\n\n \n\n\n\n Pattern Recognition, 79: 452 - 466. 2018.\n \n\n\n\n
\n\n\n\n \n \n \"ShapePaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 3 downloads\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{GROGAN2018452, \ntitle =  {Shape registration with directional data},\njournal =  {Pattern Recognition}, \nvolume =  {79},\npages =  {452 - 466},\nyear =  {2018},\nissn =  {0031-3203},\nabstract = {We propose several cost functions for registration of \nshapes encoded with Euclidean and/or non-Euclidean information (unit vectors). \nOur framework is assessed for estimation of both rigid and non-rigid transformations between \nthe target and model shapes corresponding to 2D contours and 3D surfaces. The experimental results obtained \nconfirm that using the combination of a point's position and unit normal vector in a cost function can enhance \nthe registration results compared to state of the art methods.},\ndoi =  {10.1016/j.patcog.2018.02.021},\nurl =  {https://arxiv.org/pdf/1708.07791.pdf},\nauthor =  {Mairead Grogan and Rozenn Dahyot}, \nkeywords =  {Shape registration, Directional information, Von Mises-Fisher,  registration}}\n\n
\n
\n\n\n
\n We propose several cost functions for registration of shapes encoded with Euclidean and/or non-Euclidean information (unit vectors). Our framework is assessed for estimation of both rigid and non-rigid transformations between the target and model shapes corresponding to 2D contours and 3D surfaces. The experimental results obtained confirm that using the combination of a point's position and unit normal vector in a cost function can enhance the registration results compared to state-of-the-art methods.\n
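A toy version of a cost that mixes point positions with unit normal directions is sketched below. It is a simplified, fully-matched illustration with assumed weights, not the mixture-based registration cost used in the paper.

# Toy cost mixing Euclidean positions with unit normal directions for
# already-matched point pairs; the paper's cost operates without known
# correspondences, which is not shown here.
import numpy as np

def position_normal_cost(P, N, Q, M, sigma=0.05, kappa=5.0):
    """P, Q: (n,3) matched positions; N, M: (n,3) unit normals."""
    pos_term = np.sum((P - Q) ** 2) / (2.0 * sigma ** 2)
    # von Mises-Fisher style directional agreement (larger dot product = better fit)
    dir_term = -kappa * np.sum(np.sum(N * M, axis=1))
    return pos_term + dir_term

P = np.random.rand(10, 3); N = np.tile([0.0, 0.0, 1.0], (10, 1))
print(position_normal_cost(P, N, P + 0.01, N))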
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2017\n \n \n (9)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n BONSEYES: Platform for Open Development of Systems of Artificial Intelligence: Invited Paper.\n \n \n \n \n\n\n \n Llewellynn, T.; Fernández-Carrobles, M. M.; Deniz, O.; Fricker, S.; Storkey, A.; Pazos, N.; Velikic, G.; Leufgen, K.; Dahyot, R.; Koller, S.; Goumas, G.; Leitner, P.; Dasika, G.; Wang, L.; and Tutschku, K.\n\n\n \n\n\n\n In Proceedings of the Computing Frontiers Conference, of CF'17, pages 299–304, New York, NY, USA, 2017. ACM\n \n\n\n\n
\n\n\n\n \n \n \"BONSEYES:Paper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 2 downloads\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@inproceedings{Llewellynn:2017:BPO:3075564.3076259,\nauthor =  {Llewellynn, Tim and Fern\\'{a}ndez-Carrobles, M. Milagro and Deniz, Oscar and Fricker, Samuel and Storkey, Amos and Pazos, Nuria and Velikic, Gordana and Leufgen, Kirsten and Dahyot, Rozenn and Koller, Sebastian and Goumas, Georgios and Leitner, Peter and Dasika, Ganesh and Wang, Lei and Tutschku, Kurt}, \ntitle =  {BONSEYES: Platform for Open Development of Systems of Artificial Intelligence: Invited Paper}, \nbooktitle =  {Proceedings of the Computing Frontiers Conference}, \nseries =  {CF'17}, \nyear =  {2017}, \nisbn =  {978-1-4503-4487-6}, \nlocation =  {Siena, Italy}, \npages =  {299--304}, \nnumpages =  {6}, \nurl =  {http://dl.acm.org/ft_gateway.cfm?id=3076259&type=pdf}, \ndoi =  {10.1145/3075564.3076259},\nabstract = {The Bonseyes EU H2020 collaborative project aims to develop a platform consisting of a Data Marketplace, \na Deep Learning Toolbox, and Developer Reference Platforms for organizations wanting to adopt Artificial Intelligence. \nThe project will be focused on using artificial intelligence in low power Internet of Things (IoT) devices ("edge computing"), \nembedded computing systems, and data center servers ("cloud computing"). It will bring about orders of magnitude improvements in efficiency,\nperformance, reliability, security, and productivity in the design and programming of systems of artificial intelligence \nthat incorporate Smart Cyber-Physical Systems (CPS). In addition, it will solve a causality problem for organizations who lack \naccess to Data and Models. Its open software architecture will facilitate adoption of the whole concept on a wider scale.\nTo evaluate the effectiveness, technical feasibility, and to quantify the real-world improvements in efficiency, security, performance,\neffort and cost of adding AI to products and services using the Bonseyes platform, four complementary demonstrators will be built. \nBonseyes platform capabilities are aimed at being aligned with the European FI-PPP activities and take advantage of its flagship project FIWARE.\nThis paper provides a description of the project motivation, goals and preliminary work.},\nacmid =  {3076259}, \npublisher =  {ACM},\naddress =  {New York, NY, USA}, \nkeywords =  {Data marketplace, Deep Learning, Internet of things, Smart Cyber-Physical Systems}}\n\n
\n
\n\n\n
\n The Bonseyes EU H2020 collaborative project aims to develop a platform consisting of a Data Marketplace, a Deep Learning Toolbox, and Developer Reference Platforms for organizations wanting to adopt Artificial Intelligence. The project will be focused on using artificial intelligence in low power Internet of Things (IoT) devices (\"edge computing\"), embedded computing systems, and data center servers (\"cloud computing\"). It will bring about orders of magnitude improvements in efficiency, performance, reliability, security, and productivity in the design and programming of systems of artificial intelligence that incorporate Smart Cyber-Physical Systems (CPS). In addition, it will solve a causality problem for organizations who lack access to Data and Models. Its open software architecture will facilitate adoption of the whole concept on a wider scale. To evaluate the effectiveness, technical feasibility, and to quantify the real-world improvements in efficiency, security, performance, effort and cost of adding AI to products and services using the Bonseyes platform, four complementary demonstrators will be built. Bonseyes platform capabilities are aimed at being aligned with the European FI-PPP activities and take advantage of its flagship project FIWARE. This paper provides a description of the project motivation, goals and preliminary work.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n On Using CNN with Compressed (DCT Based) Image Data.\n \n \n \n \n\n\n \n Ulicny, M.; and Dahyot, R.\n\n\n \n\n\n\n In Irish Machine Vision and Image Processing conference (IMVIP 2017), volume e-book of proceedings with ISBN 978-0-9934207-2-6, pages 44-51, Maynooth University, 2017. \n http://mural.maynoothuniversity.ie/8841/\n\n\n\n
\n\n\n\n \n \n \"OnPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{MatejIMVIP2017, \ntitle =  {On Using CNN with Compressed (DCT Based) Image Data}, \nauthor =  {M. Ulicny and R. Dahyot}, \nbooktitle =  {Irish Machine Vision and Image Processing conference (IMVIP 2017)}, \nyear =  {2017}, \npages =  {44-51}, \nvolume =  {e-book of proceedings with ISBN 978-0-9934207-2-6},\nurl =  {http://mural.maynoothuniversity.ie/8841/1/IMVIP2017_Proceedings.pdf},\nnote = {http://mural.maynoothuniversity.ie/8841/},\nabstract = {This paper investigates the use of Convolutional Neural Networks (CNN) to classify images encoded\nin compressible form using Discrete Cosine Tranform (DCT) as an alternative to raw image format. We\nshow experimentally that DCT features, that are directly available from JPEG format for instance, can be\nprocessed as efficiently as raw image data using the same CNN architectures.},\naddress =  {Maynooth University}}\n
\n
\n\n\n
\n This paper investigates the use of Convolutional Neural Networks (CNNs) to classify images encoded in compressed form using the Discrete Cosine Transform (DCT) as an alternative to the raw image format. We show experimentally that DCT features, which are directly available from the JPEG format for instance, can be processed as efficiently as raw image data using the same CNN architectures.\n
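A minimal sketch of extracting 8x8 block DCT coefficients, the kind of representation a JPEG decoder exposes before inverse transforming, is given below; the exact feature layout fed to the CNN in the paper is not reproduced here.

# Sketch: compute 8x8 blockwise 2D DCT-II coefficients of a grayscale image.
import numpy as np
from scipy.fftpack import dct

def block_dct(img, block=8):
    h, w = (img.shape[0] // block) * block, (img.shape[1] // block) * block
    x = img[:h, :w].reshape(h // block, block, w // block, block)
    x = x.transpose(0, 2, 1, 3)                       # grid of 8x8 blocks
    coeffs = dct(dct(x, type=2, norm='ortho', axis=-1), type=2, norm='ortho', axis=-2)
    return coeffs                                      # per-block DCT coefficients

img = np.random.rand(64, 64).astype(np.float32)
print(block_dct(img).shape)   # (8, 8, 8, 8): 8x8 grid of 8x8 coefficient blocks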
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Stitching Skin Images of Scars.\n \n \n \n \n\n\n \n Zolanvari, S. M. I.; and Dahyot, R.\n\n\n \n\n\n\n In Irish Machine Vision and Image Processing conference (IMVIP 2017), volume e-book of proceedings with ISBN 978-0-9934207-2-6, pages 265-268, Maynooth University, 2017. \n http://mural.maynoothuniversity.ie/8841/\n\n\n\n
\n\n\n\n \n \n \"StitchingPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{ImanIMVIP2017, \ntitle =  {Stitching Skin Images of Scars}, \nauthor =  {S. M. I. Zolanvari and R. Dahyot}, \nbooktitle =  {Irish Machine Vision and Image Processing conference (IMVIP 2017)}, \nyear =  {2017}, \npages =  {265-268}, \nvolume =  {e-book of proceedings with ISBN 978-0-9934207-2-6},\nurl =  {http://mural.maynoothuniversity.ie/8841/1/IMVIP2017_Proceedings.pdf},\nnote = {http://mural.maynoothuniversity.ie/8841/},\nabstract = {This paper introduces an automatic procedure for aligning and stitching the medical images of skin scars\nthat have the various amount of overlapping into one single registered image. The alignment procedure is\nbased on the rigid transformation of the pair of images regarding detected matched features. The proposed\npaper compares four different feature detection methods and evaluates the methods on several clinical cases.\nFor each case, the initial image is divided into four smaller sub-images with the different dimension. The result\nshows that the Harris Corner Detector algorithm achieves nearly 99\\% accurate result with the minimum\noverlapping of 160 pixels as the fastest method.},\naddress =  {Maynooth University}}\n
\n
\n\n\n
\n This paper introduces an automatic procedure for aligning and stitching medical images of skin scars with varying amounts of overlap into a single registered image. The alignment is based on a rigid transformation between each pair of images estimated from detected matched features. The paper compares four feature detection methods and evaluates them on several clinical cases. For each case, the initial image is divided into four smaller sub-images of different dimensions. The results show that the Harris corner detector is the fastest method and achieves nearly 99% accuracy with a minimum overlap of 160 pixels.\n
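A rough sketch of the rigid-alignment step with OpenCV is shown below. Note that it uses ORB keypoints and descriptors for matching rather than the Harris detector evaluated in the paper (Harris responses alone carry no descriptors), so it is an illustration of the pipeline shape, not the published procedure.

# Sketch: estimate a rigid-like (rotation + uniform scale + translation)
# transform between two overlapping grayscale skin images and warp one onto
# the other. ORB features are used here purely for illustration.
import cv2
import numpy as np

def align_pair(img_a, img_b):
    orb = cv2.ORB_create(1000)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des_a, des_b)
    src = np.float32([kp_a[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    # Partial affine = rotation, uniform scale and translation
    M, _ = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC)
    return cv2.warpAffine(img_a, M, (img_b.shape[1], img_b.shape[0]))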
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n IDT Vs L2 Distance for Point Set Registration.\n \n \n \n \n\n\n \n Alghamdi, H.; Grogan, M.; and Dahyot, R.\n\n\n \n\n\n\n In Irish Machine Vision and Image Processing conference (IMVIP 2017), volume e-book of proceedings with ISBN 978-0-9934207-2-6, pages 91-98, Maynooth University, 2017. \n http://mural.maynoothuniversity.ie/8841/\n\n\n\n
\n\n\n\n \n \n \"IDTPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{HanaIMVIP2017, \ntitle =  {IDT Vs L2 Distance for Point Set Registration}, \nauthor =  {H. Alghamdi and M. Grogan and R. Dahyot}, \nbooktitle =  {Irish Machine Vision and Image Processing conference (IMVIP 2017)}, \nyear =  {2017}, \npages =  {91-98}, \nvolume =  {e-book of proceedings with ISBN 978-0-9934207-2-6},\nurl =  {http://mural.maynoothuniversity.ie/8841/1/IMVIP2017_Proceedings.pdf},\nnote = {http://mural.maynoothuniversity.ie/8841/},\nabstract = {Registration techniques have many applications such as 3D scans alignment, panoramic image mosaic\ncreation or shape matching. This paper focuses on (2D) point cloud registration using novel iterative algorithms\nthat are inspired by the Iterative Distribution Transfer (IDT) algorithm originally proposed to solve\ncolour transfer [Pitié et al., 2005, Pitié et al., 2007]. We propose three variants to IDT algorithm that we\ncompare with the standard L2 shape registration technique [Jian and Vemuri, 2011]. We show that our IDT\nalgorithms perform well against L2 for finding correspondences between model and target shapes.},\naddress =  {Maynooth University}}\n
\n
\n\n\n
\n Registration techniques have many applications such as 3D scan alignment, panoramic image mosaic creation and shape matching. This paper focuses on (2D) point cloud registration using novel iterative algorithms inspired by the Iterative Distribution Transfer (IDT) algorithm originally proposed for colour transfer [Pitié et al., 2005, Pitié et al., 2007]. We propose three variants of the IDT algorithm, which we compare with the standard L2 shape registration technique [Jian and Vemuri, 2011]. We show that our IDT algorithms perform well against L2 for finding correspondences between model and target shapes.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Populating virtual cities using social media.\n \n \n \n \n\n\n \n Bulbul, A.; and Dahyot, R.\n\n\n \n\n\n\n Computer Animation and Virtual Worlds, 28(5): e1742. 2017.\n \n\n\n\n
\n\n\n\n \n \n \"PopulatingPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{BulbulCAVW2016, \nauthor =  {Bulbul, Abdullah and Dahyot, Rozenn},\ntitle =  {Populating virtual cities using social media},\njournal =  {Computer Animation and Virtual Worlds}, \nvolume =  {28}, \nnumber =  {5}, \npages =  {e1742}, \nyear =  {2017}, \nkeywords =  {computer animation, crowd simulations, social media, virtual worlds}, \ndoi =  {10.1002/cav.1742}, \nurl =  {https://roznn.github.io/PDF/Bulbul_CAVW2017.pdf}, \neprint =  {https://onlinelibrary.wiley.com/doi/pdf/10.1002/cav.1742}, \nabstract =  {We propose to automatically populate geo‐located virtual cities by harvesting and \nanalyzing online contents shared on social networks and websites. We show how pose and motion paths \nof agents can be realistically rendered using information gathered from social media. 3D cities are \nautomatically generated using open‐source information available online. To provide our final rendering \nof both static and dynamic urban scenes, we use Unreal game engine.}}\n
\n
\n\n\n
\n We propose to automatically populate geo‐located virtual cities by harvesting and analyzing online contents shared on social networks and websites. We show how pose and motion paths of agents can be realistically rendered using information gathered from social media. 3D cities are automatically generated using open‐source information available online. To provide our final rendering of both static and dynamic urban scenes, we use Unreal game engine.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Robust Registration of Gaussian Mixtures for Colour Transfer.\n \n \n \n \n\n\n \n Grogan, M.; and Dahyot, R.\n\n\n \n\n\n\n Technical Report Trinity College Dublin Ireland, 2017.\n \n\n\n\n
\n\n\n\n \n \n \"RobustPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@techreport{DBLP:journals/corr/GroganD17, \nauthor =  {Mair{\\'{e}}ad Grogan and Rozenn Dahyot}, \ntitle =  {Robust Registration of Gaussian Mixtures for Colour Transfer},\ninstitution =  {Trinity College Dublin Ireland},\nabstract = {We present a flexible approach to colour transfer inspired by techniques \nrecently proposed for shape registration. Colour distributions of the palette and target \nimages are modelled with Gaussian Mixture Models (GMMs) that are robustly registered to \ninfer a non linear parametric transfer function. We show experimentally that our approach compares \nwell to current techniques both quantitatively and qualitatively. Moreover, our technique is \ncomputationally the fastest and can take efficient advantage of parallel processing architectures \nfor recolouring images and videos. Our transfer function is parametric and hence can be stored in memory \nfor later usage and also combined with other computed transfer functions to create interesting visual effects. \nOverall this paper provides a fast user friendly approach to recolouring of image and video materials.},\nvolume =  {abs/1705.06091}, \nyear =  {2017}, \nurl =  {https://arxiv.org/pdf/1705.06091.pdf}, \narchivePrefix =  {arXiv}, \neprint =  {1705.06091},\ntimestamp =  {Wed, 07 Jun 2017 14:41:30 +0200} }\n\n
\n
\n\n\n
\n We present a flexible approach to colour transfer inspired by techniques recently proposed for shape registration. Colour distributions of the palette and target images are modelled with Gaussian Mixture Models (GMMs) that are robustly registered to infer a non-linear parametric transfer function. We show experimentally that our approach compares well to current techniques both quantitatively and qualitatively. Moreover, our technique is computationally the fastest and can take efficient advantage of parallel processing architectures for recolouring images and videos. Our transfer function is parametric and can therefore be stored in memory for later use and combined with other computed transfer functions to create interesting visual effects. Overall, this paper provides a fast, user-friendly approach to recolouring image and video material.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Social media based 3D visual popularity .\n \n \n \n \n\n\n \n Bulbul, A.; and Dahyot, R.\n\n\n \n\n\n\n Computers & Graphics , 63: 28 - 36. 2017.\n \n\n\n\n
\n\n\n\n \n \n \"SocialPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@article{BulbulCAG2016,\ntitle =  {Social media based 3D visual popularity },\njournal =  {Computers \\& Graphics }, \nvolume =  {63}, \nnumber =  {},\npages =  {28 - 36}, \nyear =  {2017}, \nnote =  {}, \nissn =  {0097-8493},\ndoi =  {10.1016/j.cag.2017.01.005}, \nurl =  {https://mural.maynoothuniversity.ie/15107/1/RD_social.pdf}, \nauthor =  {Abdullah Bulbul and Rozenn Dahyot}, \nkeywords =  {3D cities },\nabstract =  {This paper proposes to use a geotagged virtual world for the visualization of people’s visual interest\nand their sentiment as captured from their social network activities. Using mobile devices, people widely \nshare their experiences and the things they find interesting through social networks. We experimentally show\nthat accumulating information over a period of time from multiple social network users allows to efficiently map\nand visualize popular landmarks as found in cities such as Rome in Italy and Dublin in Ireland. The proposed approach \nis also sensitive to temporal and spatial events that attract visual attention. We visualize the calculated popularity on\n3D virtual cities using game engine technologies. }\n}
\n
\n\n\n
\n This paper proposes to use a geotagged virtual world for the visualization of people’s visual interest and their sentiment as captured from their social network activities. Using mobile devices, people widely share their experiences and the things they find interesting through social networks. We experimentally show that accumulating information over a period of time from multiple social network users allows to efficiently map and visualize popular landmarks as found in cities such as Rome in Italy and Dublin in Ireland. The proposed approach is also sensitive to temporal and spatial events that attract visual attention. We visualize the calculated popularity on 3D virtual cities using game engine technologies. \n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Trinity College Dublin Drone Survey Dataset.\n \n \n \n \n\n\n \n Byrne, J.; Connelly, J.; Su, J.; Krylov, V.; Bourke, M.; Moloney, D.; and Dahyot, R.\n\n\n \n\n\n\n Technical Report School of Computer Science and Statistics, Trinity College Dublin, 2017.\n Tech report, mesh and Drone Images available URI: http://hdl.handle.net/2262/81836\n\n\n\n
\n\n\n\n \n \n \"TrinityPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@techreport{Drone2017,\ntitle =  {Trinity College Dublin Drone Survey Dataset}, \nauthor =  {J. Byrne and J. Connelly and  J. Su and V. Krylov and M. Bourke and D. Moloney and R. Dahyot},\ninstitution =  {School of Computer Science and Statistics, Trinity College Dublin}, \nyear =  {2017},\nnote = {Tech report, mesh  and Drone Images available URI: http://hdl.handle.net/2262/81836},\nurl =  {http://www.tara.tcd.ie/bitstream/handle/2262/81836/tcd3dintelmovidius2017-drone-imagery%5b2%5d.pdf}}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n User Interaction for Image Recolouring Using L2.\n \n \n \n \n\n\n \n Grogan, M.; Dahyot, R.; and Smolic, A.\n\n\n \n\n\n\n In Proceedings of the 14th European Conference on Visual Media Production (CVMP 2017), of CVMP 2017, pages 6:1–6:10, New York, NY, USA, 2017. ACM\n Awarded best paper CVMP 2017\n\n\n\n
\n\n\n\n \n \n \"UserPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n\n\n\n
\n
@inproceedings{Grogan:2017:UII:3150165.3150171, \nauthor =  {Grogan, Mair{\\'e}ad and Dahyot, Rozenn and Smolic, Aljosa},\ntitle =  {User Interaction for Image Recolouring Using L2}, \nbooktitle =  {Proceedings of the 14th European Conference on Visual Media Production (CVMP 2017)}, \nseries =  {CVMP 2017}, \nyear =  {2017}, \nisbn =  {978-1-4503-5329-8},\nlocation =  {London, United Kingdom}, \npages =  {6:1--6:10}, \narticleno =  {6}, \nnumpages =  {10},\nnote =  {Awarded best paper CVMP 2017}, \nurl =  {http://doi.acm.org/10.1145/3150165.3150171}, \ndoi =  {10.1145/3150165.3150171},\nabstract = {Recently, an example based colour transfer approach proposed modelling the colour distributions of a palette and target image\nusing Gaussian Mixture Models, and registers them by minimising the robust £2 distance between the mixtures. \nIn this paper we propose to extend this approach to allow for user interaction. We present two interactive recolouring applications,\nthe first allowing the user to select colour correspondences between a target and palette image, while the second palette based application \nallows the user to edit a palette of colours to determine the image recolouring. We modify the £2 based cost function to improve \nresults when an interactive interface is used, and take measures to ensure that even when minimal input is given by the user, \ngood colour transfer results are created. \nBoth applications are available through a web interface and qualitatively assessed against recent recolouring techniques.},\nacmid =  {3150171}, \npublisher =  {ACM}, \naddress =  {New York, NY, USA}, \nkeywords =  {L2 Registration, Colour transfer, palette based image recoloring}}\n
\n
\n\n\n
\n Recently, an example-based colour transfer approach proposed modelling the colour distributions of a palette and target image using Gaussian Mixture Models and registering them by minimising the robust L2 distance between the mixtures. In this paper we propose to extend this approach to allow for user interaction. We present two interactive recolouring applications: the first allows the user to select colour correspondences between a target and palette image, while the second, palette-based, application allows the user to edit a palette of colours to determine the image recolouring. We modify the L2-based cost function to improve results when an interactive interface is used, and take measures to ensure that good colour transfer results are produced even when minimal input is given by the user. Both applications are available through a web interface and are qualitatively assessed against recent recolouring techniques.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2016\n \n \n (3)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Deep Shape from a Low Number of Silhouettes.\n \n \n \n \n\n\n \n Di, X.; Dahyot, R.; and Prasad, M.\n\n\n \n\n\n\n In Hua, G.; and Jégou, H., editor(s), Computer Vision – ECCV 2016 Workshops, pages 251–265, Cham, 2016. Springer International Publishing\n \n\n\n\n
\n\n\n\n \n \n \"DeepPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 2 downloads\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@InProceedings{DiECCV2016, \nauthor =  {Di, Xinhan and Dahyot, Rozenn and Prasad, Mukta}, \neditor =  {Hua, Gang and J{\\'e}gou, Herv{\\'e}}, \ntitle =  {Deep Shape from a Low Number of Silhouettes},\nbooktitle =  {Computer Vision -- ECCV 2016 Workshops}, \nyear =  {2016}, \npublisher =  {Springer International Publishing},\naddress =  {Cham}, \npages =  {251--265},\nabstract =  {Despite strong progress in the field of 3D reconstruction from multiple views, \nholes on objects, transparency of objects and textureless scenes, continue to be open challenges. \nOn the other hand, silhouette based reconstruction techniques ease the dependency of 3d reconstruction on image pixels \nbut need a large number of silhouettes to be available from multiple views. In this paper, a novel end to end pipeline\nis proposed to produce high quality reconstruction from a low number of silhouettes, \nthe core of which is a deep shape reconstruction architecture. Evaluations on ShapeNet [1] show good quality \nof reconstruction compared with ground truth.}, \nisbn =  {978-3-319-49409-8}, \nurl =  {https://mural.maynoothuniversity.ie/15258/1/RD_deep%20shape.pdf}, \ndoi =  {10.1007/978-3-319-49409-8}}\n\n
\n
\n\n\n
\n Despite strong progress in the field of 3D reconstruction from multiple views, holes on objects, transparency of objects and textureless scenes continue to be open challenges. On the other hand, silhouette-based reconstruction techniques ease the dependency of 3D reconstruction on image pixels but need a large number of silhouettes to be available from multiple views. In this paper, a novel end-to-end pipeline is proposed to produce high-quality reconstructions from a low number of silhouettes, the core of which is a deep shape reconstruction architecture. Evaluations on ShapeNet [1] show good quality of reconstruction compared with ground truth.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Recent techniques for (re)colouring.\n \n \n \n \n\n\n \n Grogan, M.; Carvalho, J.; and Dahyot, R.\n\n\n \n\n\n\n In Irish Machine Vision and Image Processing Conference (IMVIP 2016), Galway, Ireland, August 2016. \n URI http://hdl.handle.net/10379/6136\n\n\n\n
\n\n\n\n \n \n \"RecentPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@inproceedings{GroganIMVIP2016, \ntitle =  {Recent techniques for (re)colouring}, \nauthor =  {M. Grogan and J. Carvalho and R. Dahyot},\nbooktitle =  {Irish Machine Vision and Image Processing Conference (IMVIP 2016)},\nmonth =  {August}, \nabstract = {This paper investigates how several techniques can be used together for colouring frames in grey level sequences.\nA trained deep neural network is used to colour a grey level image coherently ,\nand this colour image can be recoloured further to change its feel. When considering\nvideos however, artifacts are created in the first step when the same semantic object can occasionally\nbe given different colours from frame to frame in the sequence creating a flicker in the resulting coloured\nsequence.},\nkeywords = {colour transfer, colouring, deep learning, flicker},\naddress =  {Galway, Ireland}, \nyear =  {2016}, \nnote = {URI http://hdl.handle.net/10379/6136},\nurl =  {https://aran.library.nuigalway.ie/bitstream/handle/10379/6136/IMVIP2016Book.pdf}}\n\n
\n
\n\n\n
\n This paper investigates how several techniques can be used together for colouring frames in grey-level sequences. A trained deep neural network is used to colour a grey-level image coherently, and this colour image can then be recoloured further to change its feel. When considering videos, however, artifacts are created in the first step: the same semantic object can occasionally be given different colours from frame to frame, creating a flicker in the resulting coloured sequence.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Robust ellipse detection with Gaussian mixture models.\n \n \n \n \n\n\n \n Arellano, C.; and Dahyot, R.\n\n\n \n\n\n\n Pattern Recognition, 58: 12 - 26. 2016.\n Github https://github.com/clarella/L2-Ellipse-Fitting\n\n\n\n
\n\n\n\n \n \n \"RobustPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 3 downloads\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{ArellanoPR2015, \ntitle =  {Robust ellipse detection with Gaussian mixture models}, \njournal =  {Pattern Recognition}, \nvolume =  {58},\npages =  {12 - 26},\nyear =  {2016}, \nissn =  {0031-3203}, \ndoi =  {10.1016/j.patcog.2016.01.017}, \nabstract = {The Euclidian distance between Gaussian Mixtures has been shown to be robust to perform point set registration (Jian and Vemuri, 2011). \nWe propose to extend this idea for robustly matching a family of shapes (ellipses). Optimisation is performed with an annealing strategy, \nand the search for occurrences is repeated several times to detect multiple instances of the shape of interest. We compare experimentally our approach\nto other state-of-the-art techniques on a benchmark database for ellipses, and demonstrate the good performance of our approach.},\nurl =  {https://mural.maynoothuniversity.ie/15108/1/D_robust.pdf},\nnote = {Github https://github.com/clarella/L2-Ellipse-Fitting},\nauthor =  {Claudia Arellano and Rozenn Dahyot}, \nkeywords =  {Ellipse detection, L2 distance, GMM, Parameter estimation}}\n\n
\n
\n\n\n
\n The Euclidean distance between Gaussian mixtures has been shown to be robust for point set registration (Jian and Vemuri, 2011). We propose to extend this idea to robustly matching a family of shapes (ellipses). Optimisation is performed with an annealing strategy, and the search for occurrences is repeated several times to detect multiple instances of the shape of interest. We compare our approach experimentally to other state-of-the-art techniques on a benchmark database for ellipses, and demonstrate the good performance of our approach.\n
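A small sketch of the shape-as-mixture idea is given below: centres of equal-weight isotropic Gaussian components are sampled along a parametric ellipse. Matching such a mixture to edge data under the L2 distance is the idea behind the detector; the annealed optimisation itself and the paper's exact model are not reproduced here.

# Sketch: represent an ellipse as a Gaussian mixture by placing isotropic
# components at points sampled along its parametric contour.
import numpy as np

def ellipse_mixture(cx, cy, a, b, theta, n=40, sigma=2.0):
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    pts = np.stack([a * np.cos(t), b * np.sin(t)], axis=1)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    centres = pts @ R.T + np.array([cx, cy])
    weights = np.full(n, 1.0 / n)
    return centres, weights, sigma        # equal-weight isotropic components

centres, w, s = ellipse_mixture(50, 40, 30, 15, np.deg2rad(20))
print(centres.shape, w.sum())             # (40, 2) 1.0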
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2015\n \n \n (7)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n 3D Reconstruction of Reflective Spherical Surfaces from Multiple Images.\n \n \n \n \n\n\n \n Bulbul, A.; Grogan, M.; and Dahyot, R.\n\n\n \n\n\n\n In Irish Machine Vision and Image Processing Conference (IMVIP 2015), pages 19-26, Dublin, Ireland, August 2015. \n URI http://hdl.handle.net/2262/74714\n\n\n\n
\n\n\n\n \n \n \"3DPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@inproceedings{BulbulIMVIP2015,\ntitle =  {3D Reconstruction of Reflective Spherical Surfaces from Multiple Images}, \nauthor =  {A. Bulbul and M. Grogan and R. Dahyot}, \nbooktitle =  {Irish Machine Vision and Image Processing Conference (IMVIP 2015)}, \nmonth =  {August}, \nyear =  {2015},\naddress =  {Dublin, Ireland},\npages = {19-26},\nabstract = {Despite the recent advances in 3D reconstruction from images, the state of the art methods fail to ac-\ncurately reconstruct objects with reflective materials. The underlying reason for this inaccuracy is that the\ndetected image features belong to the reflected scene instead of the reconstructed object and do not lie on\nthe surface of the object. In this study, we propose a method to refine the 3D reconstruction of reflective\nconvex surfaces. This method utilizes the geometrical distortion of the reflected scenes behind a spherical\nsurface.},\nkeywords = {3D reconstruction, Shape from images, Hough Transform, Specular surface},\nurl = {http://www.tara.tcd.ie/bitstream/handle/2262/74714/IMVIP2015Book.pdf},\nnote =  {URI http://hdl.handle.net/2262/74714}}\n
\n
\n\n\n
\n Despite the recent advances in 3D reconstruction from images, state-of-the-art methods fail to accurately reconstruct objects with reflective materials. The underlying reason for this inaccuracy is that the detected image features belong to the reflected scene instead of the reconstructed object and do not lie on the surface of the object. In this study, we propose a method to refine the 3D reconstruction of reflective convex surfaces. This method utilises the geometrical distortion of the reflected scenes behind a spherical surface.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Hand Hygiene Poses Recognition with RGB-D Videos.\n \n \n \n \n\n\n \n Xia, B.; Dahyot, R.; Ruttle, J.; Caulfield, D.; and Lacey, G.\n\n\n \n\n\n\n In Irish Machine Vision and Image Processing Conference (IMVIP 2015), pages 43-50, Dublin, Ireland, August 2015. \n URI http://hdl.handle.net/2262/74714\n\n\n\n
\n\n\n\n \n \n \"HandPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 3 downloads\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n\n\n\n
\n
@inproceedings{XiaIMVIP2015, \ntitle =  {Hand Hygiene Poses Recognition with RGB-D Videos}, \nauthor =  {B. Xia and R. Dahyot and J. Ruttle and D. Caulfield and G. Lacey}, \nbooktitle =  {Irish Machine Vision and Image Processing Conference (IMVIP 2015)}, \npages = {43-50},\nabstract = {Hand hygiene is the most effective way in preventing the health care-associated infection. In this work,\nwe propose to investigate the automatic recognition of the hand hygiene poses with RGB-D videos. Different\nclassifiers are experimented with the Histogram of Oriented Gradient (HOG) features extracted from\nthe hand regions. With a frame-level classification rate of more than 95\\%, and with 100\\% video-level classification\nrate, we demonstrate the effectiveness of our method for recognizing these hand hygiene poses.\nAlso, we demonstrate that using the temporal information, and combining the color with depth information\ncan improve the recognition accuracy.},\nkeywords = {Hand Hygiene, Poses Recognition, RGB-D},\nmonth =  {August},\naddress =  {Dublin, Ireland},\nyear =  {2015},\nurl = {http://www.tara.tcd.ie/bitstream/handle/2262/74714/IMVIP2015Book.pdf},\nnote =  {URI http://hdl.handle.net/2262/74714}}\n
\n
\n\n\n
\n Hand hygiene is the most effective way of preventing healthcare-associated infections. In this work, we investigate the automatic recognition of hand hygiene poses from RGB-D videos. Different classifiers are evaluated with Histogram of Oriented Gradients (HOG) features extracted from the hand regions. With a frame-level classification rate of more than 95%, and a 100% video-level classification rate, we demonstrate the effectiveness of our method for recognising these hand hygiene poses. We also demonstrate that using temporal information, and combining colour with depth information, can improve the recognition accuracy.\n
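A minimal sketch of the HOG-plus-classifier pipeline using scikit-image and scikit-learn follows. The crop size, HOG parameters and the linear SVM are illustrative assumptions, not the configuration reported in the paper, and random arrays stand in for real hand-region crops.

# Sketch: HOG features from cropped hand-region images fed to a linear SVM.
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def hog_features(frames):
    # frames: iterable of 2D grayscale hand-region crops (e.g. 64x64)
    return np.array([hog(f, orientations=9, pixels_per_cell=(8, 8),
                         cells_per_block=(2, 2)) for f in frames])

rng = np.random.default_rng(0)
X_train = hog_features(rng.random((20, 64, 64)))   # stand-in training crops
y_train = rng.integers(0, 6, 20)                   # stand-in pose labels
clf = LinearSVC(dual=False).fit(X_train, y_train)
print(clf.predict(hog_features(rng.random((2, 64, 64)))))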
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Information visualisation for social media analytics.\n \n \n \n \n\n\n \n Dahyot, R.; Brady, C.; Bourges, C.; and Bulbul, A.\n\n\n \n\n\n\n In 2015 International Workshop on Computational Intelligence for Multimedia Understanding (IWCIM), Prague, Czech Republic, 29-30 October 2015. \n \n\n\n\n
\n\n\n\n \n \n \"InformationPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@inproceedings{DahyotIWCIM2015, \ntitle =  {Information visualisation for social media analytics},\nauthor =  {R. Dahyot and C. Brady and C. Bourges and A. Bulbul}, \nbooktitle =  {2015 International Workshop on Computational Intelligence for Multimedia Understanding (IWCIM)}, \nkeywords =  {Global Positioning System;data visualisation;rendering (computer graphics);social networking (online);GPS;audio visual rendering;dataset visualization;geolocated datasets;information visualisation;location information;sentiment extraction;social media analytics;social networks;timestamp;Geology;Google;Heating;Media;Silicon;Visualization;Social Media Analytics;Visualisation}, \naddress =  {Prague, Czech Republic}, \nmonth =  {29-30 October}, \nyear =  {2015}, \nabstract = {This paper tackles the audio visual renderings of geolocated datasets harvested from social networks. \nThese datasets are noisy, multimodal and heterogeneous by nature, providing different fields of information. \nWe focus here on the information of location (GPS), time (timestamp) and text from tweets from which sentiment is extracted. \nWe provide two ways for visualising datasets and for which demos can be seen online.},\nurl =  {https://mural.maynoothuniversity.ie/15259/1/RD_information.pdf}, \ndoi =  {10.1109/IWCIM.2015.7347082}}\n\n
\n
\n\n\n
\n This paper tackles the audio-visual rendering of geolocated datasets harvested from social networks. These datasets are noisy, multimodal and heterogeneous by nature, providing different fields of information. We focus here on location (GPS), time (timestamp) and the text of tweets, from which sentiment is extracted. We provide two ways of visualising such datasets, for which demos can be seen online.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n L2 Registration for Colour Transfer.\n \n \n \n \n\n\n \n Grogan, M.; Prasad, M.; and Dahyot, R.\n\n\n \n\n\n\n In European Signal Processing Conference (Eusipco), Nice France, September 2015. \n https://www.eurasip.org/Proceedings/Eusipco/Eusipco2015/papers/1570102575.pdf\n\n\n\n
\n\n\n\n \n \n \"L2Paper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@INPROCEEDINGS{GroganEusipco2015, \ntitle =  {L2 Registration for Colour Transfer}, \nauthor =  {M. Grogan and M. Prasad and R. Dahyot}, \nbooktitle =  {European Signal Processing Conference (Eusipco)}, \naddress =  {Nice France}, \nmonth =  {September}, \nyear =  {2015}, \nabstract = {This paper proposes to perform colour transfer by minimising a divergence (the L2 distance) between two colour distributions. \nWe propose to model each dataset by a compact\nGaussian mixture which is designed for the specific purpose\nof colour transfer between images which have different scene\ncontent. A non rigid transformation is estimated by minimising the Euclidean distance (L2) between these two distributions, and \nthe estimated transformation is used for transferring colour statistics from one image to another. Experimental\nresults show that this is a very promising approach for transferring colour and it performs very well against an alternative\nreference approach.},\nnote={https://www.eurasip.org/Proceedings/Eusipco/Eusipco2015/papers/1570102575.pdf},\ndoi =  {10.1109/EUSIPCO.2015.7362799},\nurl =  {https://mural.maynoothuniversity.ie/15261/1/RD_L2%20registration.pdf}\n}\n\n
\n
\n\n\n
\n This paper proposes to perform colour transfer by minimising a divergence (the L2 distance) between two colour distributions. We propose to model each dataset with a compact Gaussian mixture designed for the specific purpose of colour transfer between images that have different scene content. A non-rigid transformation is estimated by minimising the Euclidean (L2) distance between these two distributions, and the estimated transformation is used to transfer colour statistics from one image to another. Experimental results show that this is a very promising approach for transferring colour and that it performs very well against an alternative reference approach.\n
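The L2 distance between two Gaussian mixtures has a closed form because the integral of a product of Gaussians is itself a Gaussian evaluation. The sketch below uses isotropic components with assumed parameters; it illustrates only the distance, not the paper's full transfer-function estimation.

# Sketch: closed-form L2 distance between two Gaussian mixtures with isotropic
# components, using  integral N(x; m1, s1 I) N(x; m2, s2 I) dx = N(m1; m2, (s1+s2) I).
import numpy as np

def gauss_eval(m1, m2, var, d):
    diff2 = np.sum((m1[:, None, :] - m2[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * diff2 / var) / (2.0 * np.pi * var) ** (d / 2.0)

def l2_gmm(mu_f, w_f, s_f, mu_g, w_g, s_g):
    d = mu_f.shape[1]
    ff = w_f @ gauss_eval(mu_f, mu_f, 2 * s_f, d) @ w_f
    gg = w_g @ gauss_eval(mu_g, mu_g, 2 * s_g, d) @ w_g
    fg = w_f @ gauss_eval(mu_f, mu_g, s_f + s_g, d) @ w_g
    return ff - 2.0 * fg + gg           # integral of (f - g)^2

mu = np.random.rand(5, 3); w = np.full(5, 0.2)
print(l2_gmm(mu, w, 0.01, mu + 0.05, w, 0.01))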
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n IRISH MACHINE VISION and IMAGE PROCESSING Conference proceedings 2015.\n \n \n \n \n\n\n \n Dahyot, R.; Lacey, G.; Dawson-Howe, K.; Pitie, F.; and Moloney, D.,\n editors.\n \n\n\n \n\n\n\n Irish Pattern Recognition and Classification Society (ISBN 978-0-9934207-0-2). Dublin, Ireland, August 2015.\n URI: http://hdl.handle.net/2262/74714\n\n\n\n
\n\n\n\n \n \n \"IRISHPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@proceedings{IMVIP2015,\neditor  = {Rozenn Dahyot and Gerard Lacey and Kenneth Dawson-Howe and  Francois Pitie and David Moloney},\ntitle = {IRISH MACHINE VISION and IMAGE PROCESSING Conference proceedings 2015},\nurl = {http://www.tara.tcd.ie/bitstream/handle/2262/74714/IMVIP2015Book.pdf},\nabstract = {},\nPublisher = {Irish Pattern Recognition and Classification Society (ISBN 978-0-9934207-0-2)},\nnote = {URI: http://hdl.handle.net/2262/74714},\naddress = {Dublin, Ireland},\nmonth = {August},\nyear = {2015}}\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n L2 Registration for Colour Transfer in Videos.\n \n \n \n \n\n\n \n Grogan, M.; and Dahyot, R.\n\n\n \n\n\n\n In Proceedings of the 12th European Conference on Visual Media Production, of CVMP '15, pages 16:1–16:1, New York, NY, USA, 2015. ACM\n Awarded Best Student Poster at CVMP 2015 \n\n\n\n
\n\n\n\n \n \n \"L2Paper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{GroganCVMP2015, \nauthor =  {Grogan, Mairead and Dahyot, Rozenn}, \ntitle =  {L2 Registration for Colour Transfer in Videos}, \nbooktitle =  {Proceedings of the 12th European Conference on Visual Media Production}, \nseries =  {CVMP '15}, \nyear =  {2015}, \nisbn =  {978-1-4503-3560-7}, \nlocation =  {London, United Kingdom}, \npages =  {16:1--16:1}, \narticleno =  {16}, \nnumpages =  {1},\nurl =  {http://doi.acm.org/10.1145/2824840.2824862}, \nnote = {Awarded Best Student Poster at CVMP 2015 },\nacmid =  {2824862}, \npublisher =  {ACM},\naddress =  {New York, NY, USA}, \nabstract = {We propose a method for colour transfer by minimising the L2 distance between two colour distributions.\nWe use Gaussian Mixture Models (GMMs) to model the colour distribution of the target and palette images and use L2 to find a transformation φ \nwhich register the GMM's. The L2 distance has been shown to be robust for shape registration application [2]. The function φ is modelled as either \nan affine or Thin Plate Spline transformation controlled by a latent vector θ.\nThe affine function consists of a 3x3 matrix A and 3D vector offset o.},\ndoi =  {10.1145/2824840.2824862}}\n\n
\n
\n\n\n
\n We propose a method for colour transfer by minimising the L2 distance between two colour distributions. We use Gaussian Mixture Models (GMMs) to model the colour distributions of the target and palette images and use L2 to find a transformation φ that registers the GMMs. The L2 distance has been shown to be robust for shape registration applications [2]. The function φ is modelled as either an affine or a Thin Plate Spline transformation controlled by a latent vector θ. The affine function consists of a 3x3 matrix A and a 3D offset vector o.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Social Media Based 3D Modeling and Visualization.\n \n \n \n \n\n\n \n Bulbul, A.; and Dahyot, R.\n\n\n \n\n\n\n In Proceedings of the 12th European Conference on Visual Media Production, of CVMP '15, pages 20:1–20:1, New York, NY, USA, 2015. ACM\n \n\n\n\n
\n\n\n\n \n \n \"SocialPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{BulbulCVMP2015, \nauthor =  {Bulbul, Abdullah and Dahyot, Rozenn}, \ntitle =  {Social Media Based 3D Modeling and Visualization}, \nbooktitle =  {Proceedings of the 12th European Conference on Visual Media Production}, \nseries =  {CVMP '15}, \nyear =  {2015},\nisbn =  {978-1-4503-3560-7},\nlocation =  {London, United Kingdom}, \npages =  {20:1--20:1}, \narticleno =  {20}, \nnumpages =  {1},\nurl =  {http://doi.acm.org/10.1145/2824840.2824860}, \ndoi =  {10.1145/2824840.2824860},\nabstract = {Social Media is a very rich source of up-to-date localized information. \nIn recent years, image collections from photo sharing websites (e.g. Flicker) have been effectively used for 3D reconstruction of objects,\nbuildings and even cities. While 3D reconstruction techniques are highly improved in terms of accuracy, performance, and parallelism \nthere are still means to utilize the up-to-date information available from public social sharing websites such as Twitter and\nInstagram for continuous refinement of the 3D models and information visualization. Our emphasis is on utilizing the information for detecting \nand refining the changes in the scene, \nadding new structures and visualizing saliency/popularity information in 3D.},\nacmid =  {2824860}, \npublisher =  {ACM}, \naddress =  {New York, NY, USA}}\n\n
\n
\n\n\n
\n Social media is a very rich source of up-to-date, localised information. In recent years, image collections from photo sharing websites (e.g. Flickr) have been used effectively for 3D reconstruction of objects, buildings and even cities. While 3D reconstruction techniques have improved greatly in terms of accuracy, performance and parallelism, there are still ways to exploit the up-to-date information available from public social sharing websites such as Twitter and Instagram for continuous refinement of 3D models and for information visualisation. Our emphasis is on utilising this information for detecting and refining changes in the scene, adding new structures and visualising saliency/popularity information in 3D.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2014\n \n \n (6)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n An Architecture for Social Media Summarisation.\n \n \n \n \n\n\n \n Zdziarski, Z.; Mitchell, J.; Houdyer, P.; Johnson, D.; Bourges, C.; and Dahyot, R.\n\n\n \n\n\n\n In Irish Machine Vision and Image Processing Conference (IMVIP 2014), pages 187-188, Derry-Londonderry, Northern Ireland, 27-29 August 2014. \n URI http://hdl.handle.net/2262/71411\n\n\n\n
\n\n\n\n \n \n \"AnPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{graisearchIMVIP2014, \ntitle =  {An Architecture for Social Media Summarisation},\nauthor =  {Z. Zdziarski and J. Mitchell and P. Houdyer and D. Johnson and C. Bourges and R. Dahyot}, \nbooktitle =  {Irish Machine Vision and Image Processing Conference (IMVIP 2014)},\naddress =  {Derry-Londonderry, Northern Ireland},\npages =  {187-188}, \nyear =  {2014}, \nmonth =  {27-29 August},\nurl = {http://www.tara.tcd.ie/bitstream/handle/2262/71411/IMVIP2014_Proceedings.pdf},\nnote =  {URI http://hdl.handle.net/2262/71411}}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Extension of GBVS to 3D media.\n \n \n \n \n\n\n \n Zdziarski, Z.; and Dahyot, R.\n\n\n \n\n\n\n In Signal Processing and Communications Applications Conference (SIU), 2014 22nd, pages 2296-2300, April 2014. \n \n\n\n\n
\n\n\n\n \n \n \"ExtensionPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@INPROCEEDINGS{Zdziarski:SIU:2014, \nauthor =  {Zdziarski, Z. and Dahyot, R.},\nbooktitle =  {Signal Processing and Communications Applications Conference (SIU), 2014 22nd},\ntitle =  {Extension of GBVS to 3D media},\nyear =  {2014}, \nmonth =  {April}, \npages =  {2296-2300}, \nkeywords =  {computer vision;object tracking;2D displays;3D media;GBVS algorithm;GBVS extension;eye tracking technologies;graph-based visual saliency algorithm;Algorithm design and analysis;Computational modeling;Conferences;Signal processing algorithms;Stereo image processing;Three-dimensional displays;Visualization;3D media;Visual saliency}, \nabstract = {Visual saliency has been studied extensively in the past decades through perceptual studies using eye tracking technologies\nand 2D displays. Visual saliency algorithms have been successfully developed to mimick the human ability to quickly spot informative local areas in images. \nThis paper proposes to investigate the extension of visual saliency algorithms to media displayed in 3D. We show first that the Graph-Based Visual \nSaliency (GBVS) algorithm outperforms all the other common 2D algorithms as well as their 3D extensions.\nThis paper then extends GBVS to 3D and shows that these new 3D GBVS based algorithms outperform other past algorithms.},\nurl={https://mural.maynoothuniversity.ie/15265/1/RD_extension.pdf},\ndoi =  {10.1109/SIU.2014.6830723}}\n\n
\n
\n\n\n
\n Visual saliency has been studied extensively in past decades through perceptual studies using eye tracking technologies and 2D displays. Visual saliency algorithms have been successfully developed to mimic the human ability to quickly spot informative local areas in images. This paper investigates the extension of visual saliency algorithms to media displayed in 3D. We first show that the Graph-Based Visual Saliency (GBVS) algorithm outperforms all the other common 2D algorithms as well as their 3D extensions. The paper then extends GBVS to 3D and shows that these new 3D GBVS-based algorithms outperform previous algorithms.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n GR2T vs L2E with Nuisance Scale.\n \n \n \n \n\n\n \n Dahyot, R.\n\n\n \n\n\n\n In 2014 22nd International Conference on Pattern Recognition, pages 3857-3861, Aug 2014. \n \n\n\n\n
\n\n\n\n \n \n \"GR2TPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@inproceedings{ICPR2014Dahyot, \nauthor =  {R. Dahyot}, \nbooktitle =  {2014 22nd International Conference on Pattern Recognition}, \ntitle =  {GR2T vs L2E with Nuisance Scale}, \nyear =  {2014}, \nvolume =  {}, \nurl = {https://mural.maynoothuniversity.ie/15266/1/RD_GR2t.pdf},\nnumber =  {}, \npages =  {3857-3861}, \nkeywords =  {Radon transforms;regression analysis;GR2T;L2E;generalized relaxed Radon transform;nuisance scale;regression analysis;robust parameter estimation;scale parameter;Equations;Estimation;Mathematical model;Noise;Pattern recognition;Robustness;Transforms},\ndoi =  {10.1109/ICPR.2014.662},\nabstract = {We compare the objective functions used by GR2T and the L2E estimator  \nthat have both been proposed for robust parameter estimation. We show their similarity when estimating location parameters.\nOf particular interest is their ability for dealing with the scale parameter that is often unknown and acts as a nuisance parameter. \nBoth techniques are tested experimentally for regression (e.g. to find patterns such as line \nand circle in noisy datasets) and for registration between datasets.}, \nISSN =  {1051-4651}, \nmonth =  {Aug}}\n\n
\n
\n\n\n
\n We compare the objective functions used by GR2T and the L2E estimator that have both been proposed for robust parameter estimation. We show their similarity when estimating location parameters. Of particular interest is their ability to deal with the scale parameter that is often unknown and acts as a nuisance parameter. Both techniques are tested experimentally for regression (e.g. to find patterns such as lines and circles in noisy datasets) and for registration between datasets.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Mesh from Depth Images Using GR2T.\n \n \n \n \n\n\n \n Grogan, M.; and Dahyot, R.\n\n\n \n\n\n\n In Irish Machine Vision and Image Processing Conference, pages 15-20, Derry-Londonderry, Northern Ireland, 27-29 August 2014. \n URI: http://hdl.handle.net/2262/71411\n\n\n\n
\n\n\n\n \n \n \"MeshPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 1 download\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n\n\n\n
\n
@inproceedings{GroganIMVIP2014, \ntitle =  {Mesh from Depth Images Using GR2T}, \nauthor =  {M. Grogan and R. Dahyot}, \nbooktitle =  {Irish Machine Vision and Image Processing Conference}, \naddress =  {Derry-Londonderry, Northern Ireland}, \nabstract = {This paper proposes an algorithm for inferring a 3D mesh using the robust cost function\nproposed by Ruttle et al. Our contribution is in proposing a new inference algorithm\nthat is well suited to parallel architectures. The cost function also provides\na goodness of fit for each element of the mesh which is correlated with the distance to the\nground truth, hence providing informative feedback to users.},\nkeywords = {3D reconstruction, Depth images, Generalised Relaxed Radon Transform},\npages =  {15-20}, \nmonth =  {27-29 August},\nyear =  {2014},\nnote = {URI: http://hdl.handle.net/2262/71411},\nurl =  {http://www.tara.tcd.ie/bitstream/handle/2262/71411/IMVIP2014_Proceedings.pdf}}\n\n
\n
\n\n\n
\n This paper proposes an algorithm for inferring a 3D mesh using the robust cost function proposed by Ruttle et al. Our contribution is in proposing a new inference algorithm that is well suited to parallel architectures. The cost function also provides a goodness of fit for each element of the mesh which is correlated with the distance to the ground truth, hence providing informative feedback to users.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n On summarising the 'here and now' of social videos for smart mobile browsing.\n \n \n \n \n\n\n \n Zdziarski, Z.; Bourgès, C.; Mitchell, J.; Houdyer, P.; Johnson, D.; and Dahyot, R.\n\n\n \n\n\n\n In 2014 International Workshop on Computational Intelligence for Multimedia Understanding (IWCIM), Paris, France, 1-2 Nov. 2014. \n \n\n\n\n
\n\n\n\n \n \n \"OnPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@inproceedings{graisearchCIMU2014, \ntitle =  {On summarising the 'here and now' of social videos for smart mobile browsing}, \nabstract = {The amount of media that is being uploaded to social sites (such as Twitter, Facebook and Instagram)\nis providing a wealth of visual data (images and videos) augmented with additional information such as keywords, timestamps and GPS coordinates. \nTapastreet provides access in real-time to this visual content by harvesting social networks for visual media associated with particular locations, \ntimes and hashtags [1]. Browsing efficiently through harvested videos requires smart processing to give users a quick overview of their \ncontent, in particular when using mobile platforms with limited bandwidth. \nThis paper presents an architecture for testing several strategies for producing summaries of videos collected on social \nnetworks to tackle this issue.},\naddress =  {Paris, France}, \nmonth =  {1-2 Nov.}, \nyear =  {2014}, \nauthor =  {Z. Zdziarski and C. Bourgès and J. Mitchell and P. Houdyer and D. Johnson and R. Dahyot},\nbooktitle =  {2014 International Workshop on Computational Intelligence for Multimedia Understanding (IWCIM)}, \nkeywords =  {mobile computing;social networking (online);video retrieval;video signal processing;Tapastreet;mobile platforms;smart mobile browsing;smart processing;social networks;social sites;social video summarisation;visual content;visual data;visual media;Media;Pipelines;Social network services;Streaming media;Transform coding;Videos;Visualization;Blur Detection;MPEG Codec;Social Media;Video Summarisation;Web Harvesting},\ndoi =  {10.1109/IWCIM.2014.7008797}, \nurl={https://mural.maynoothuniversity.ie/15268/1/RD_on%20summarising.pdf},\nISSN =  {}}\n
\n
\n\n\n
\n The amount of media that is being uploaded to social sites (such as Twitter, Facebook and Instagram) is providing a wealth of visual data (images and videos) augmented with additional information such as keywords, timestamps and GPS coordinates. Tapastreet provides access in real-time to this visual content by harvesting social networks for visual media associated with particular locations, times and hashtags [1]. Browsing efficiently through harvested videos requires smart processing to give users a quick overview of their content, in particular when using mobile platforms with limited bandwidth. This paper presents an architecture for testing several strategies for producing summaries of videos collected on social networks to tackle this issue.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Robust shape from depth images with GR2T.\n \n \n \n \n\n\n \n Ruttle, J.; Arellano, C.; and Dahyot, R.\n\n\n \n\n\n\n Pattern Recognition Letters, 50: 43 - 54. 2014.\n URI: http://hdl.handle.net/2262/68177\n\n\n\n
\n\n\n\n \n \n \"RobustPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 2 downloads\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{RuttlePRL2014, \ntitle =  {Robust shape from depth images with GR2T}, \njournal =  {Pattern Recognition Letters}, \nvolume =  {50}, \npages =  {43 - 54}, \nyear =  {2014}, \nissn =  {0167-8655},\ndoi =  {10.1016/j.patrec.2014.01.016}, \nabstract = {This paper proposes to infer accurately a 3D shape of an object captured by a depth camera from multiple viewpoints. \nThe Generalised Relaxed Radon Transform (GR2T) [1] is used here to merge all depth images in a robust kernel density estimate \nthat models the surface of an object in the 3D space. The kernel is tailored to capture the uncertainty associated with each pixel \nin the depth images. The resulting cost function is suitable for stochastic exploration with gradient ascent algorithms when the \nnoise of the observations is modelled with a differentiable distribution. When merging several depth images captured from several viewpoints, \nextrinsic camera parameters need to be known accurately, and we extend GR2T to also estimate these nuisance parameters. \nWe illustrate qualitatively the performance of our modelling and we assess quantitatively\nthe accuracy of our 3D shape reconstructions computed from depth images captured with a Kinect camera.},\nurl =  {http://www.tara.tcd.ie/bitstream/handle/2262/68177/Ruttle%2D%2DRobust%20shape%20from%20de.pdf},\nnote = {Depth Image Analysis. URI: http://hdl.handle.net/2262/68177},\nauthor =  {Jonathan Ruttle and Claudia Arellano and Rozenn Dahyot}, \nkeywords =  {Shape from depth, Generalised Relaxed Radon Transform (GRT), Noise modelling}}\n\n
\n
\n\n\n
\n This paper proposes to infer accurately a 3D shape of an object captured by a depth camera from multiple viewpoints. The Generalised Relaxed Radon Transform (GR2T) [1] is used here to merge all depth images in a robust kernel density estimate that models the surface of an object in the 3D space. The kernel is tailored to capture the uncertainty associated with each pixel in the depth images. The resulting cost function is suitable for stochastic exploration with gradient ascent algorithms when the noise of the observations is modelled with a differentiable distribution. When merging several depth images captured from several viewpoints, extrinsic camera parameters need to be known accurately, and we extend GR2T to also estimate these nuisance parameters. We illustrate qualitatively the performance of our modelling and we assess quantitatively the accuracy of our 3D shape reconstructions computed from depth images captured with a Kinect camera.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2013\n \n \n (4)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Bayesian 3D shape from silhouettes.\n \n \n \n \n\n\n \n Kim, D.; Ruttle, J.; and Dahyot, R.\n\n\n \n\n\n\n Digital Signal Processing, 23(6): 1844 - 1855. 2013.\n \n\n\n\n
\n\n\n\n \n \n \"BayesianPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{DSP2013Dahyot,\ntitle =  {Bayesian 3D shape from silhouettes},\njournal =  {Digital Signal Processing}, \nvolume =  {23}, \nnumber =  {6}, \npages =  {1844 - 1855}, \nyear =  {2013}, \nissn =  {1051-2004}, \ndoi =  {10.1016/j.dsp.2013.06.007}, \nabstract = {This paper introduces a smooth posterior density function for inferring shapes from silhouettes. \nBoth the likelihood and the prior are modelled using kernel density functions and optimisation is performed using gradient ascent algorithms.\nAdding a prior allows for the recovery of concave areas of the shape that are usually lost when estimating the visual hull. \nThis framework is also extended to use colour information when it is available in addition to the silhouettes. \nIn these cases, the modelling not only allows for the shape to be recovered but also its colour information.\nOur new algorithms are assessed by reconstructing 2D shapes from 1D silhouettes and 3D faces from 2D silhouettes.\nExperimental results show that using the prior can assist in reconstructing \nconcave areas and also illustrate the benefits of using colour information even when only small numbers of silhouettes are available.},\nurl =  {https://mural.maynoothuniversity.ie/15118/1/RD_bayesian.pdf}, \nauthor =  {Donghoon Kim and Jonathan Ruttle and Rozenn Dahyot}, \nkeywords =  {3D reconstruction from multiple view images, Shape-from-silhouettes, Kernel density estimates, K-nearest neighbours, Principal component analysis}}\n\n
\n
\n\n\n
\n This paper introduces a smooth posterior density function for inferring shapes from silhouettes. Both the likelihood and the prior are modelled using kernel density functions and optimisation is performed using gradient ascent algorithms. Adding a prior allows for the recovery of concave areas of the shape that are usually lost when estimating the visual hull. This framework is also extended to use colour information when it is available in addition to the silhouettes. In these cases, the modelling not only allows for the shape to be recovered but also its colour information. Our new algorithms are assessed by reconstructing 2D shapes from 1D silhouettes and 3D faces from 2D silhouettes. Experimental results show that using the prior can assist in reconstructing concave areas and also illustrate the benefits of using colour information even when only small numbers of silhouettes are available.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Generalised relaxed Radon transform (GR2T) for robust inference.\n \n \n \n \n\n\n \n Dahyot, R.; and Ruttle, J.\n\n\n \n\n\n\n Pattern Recognition, 46(3): 788 - 794. 2013.\n \n\n\n\n
\n\n\n\n \n \n \"GeneralisedPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{PR2012Dahyot,\ntitle =  {Generalised relaxed Radon transform (GR2T) for robust inference}, \njournal =  {Pattern Recognition}, \nvolume =  {46}, \nnumber =  {3},\npages =  {788 - 794}, \nyear =  {2013}, \nissn =  {0031-3203}, \nurl = {https://mural.maynoothuniversity.ie/15119/1/RD_generalised.pdf},\ndoi =  {10.1016/j.patcog.2012.09.026}, \nabstract = {This paper introduces the generalised relaxed Radon transform (GR2T) as an extension to the generalised Radon transform (GRT).\nThis new modelling allows us to define a new framework for robust inference. The resulting objective functions are probability density functions \nthat can be chosen differentiable and that can be optimised using gradient methods. One of these cost functions is already widely used in the \nform of the Hough transform and the generalised projection-based M-estimator, and it is interpreted as a conditional density function on the latent variables\nof interest. In addition, the joint density function of the latent variables is also proposed as a cost function and it has the advantage of including\na prior on the latent variables. Several applications, including line detection in images and volume reconstruction \nfrom silhouettes captured from multiple views, are presented to underline the versatility of this framework.},\nauthor =  {Rozenn Dahyot and Jonathan Ruttle}, \nkeywords =  {Generalised Radon transform, Hough transform, Robust inference, M-estimator, Generalised projection based M-estimator}}\n\n\n
\n
\n\n\n
\n This paper introduces the generalised relaxed Radon transform (GR2T) as an extension to the generalised Radon transform (GRT). This new modelling allows us to define a new framework for robust inference. The resulting objective functions are probability density functions that can be chosen differentiable and that can be optimised using gradient methods. One of these cost functions is already widely used in the form of the Hough transform and the generalised projection-based M-estimator, and it is interpreted as a conditional density function on the latent variables of interest. In addition, the joint density function of the latent variables is also proposed as a cost function and it has the advantage of including a prior on the latent variables. Several applications, including line detection in images and volume reconstruction from silhouettes captured from multiple views, are presented to underline the versatility of this framework.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n On Creating a 2D & 3D Visual Saliency Dataset.\n \n \n \n \n\n\n \n Zdziarski, Z.; and Dahyot, R.\n\n\n \n\n\n\n In Proceedings of the ACM Symposium on Applied Perception, of SAP '13, pages 132–132, New York, NY, USA, 2013. ACM\n \n\n\n\n
\n\n\n\n \n \n \"OnPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{Zdziarski:2013:ACM:SAP, \nauthor =  {Zdziarski, Z. and Dahyot, R.},\ntitle =  {On Creating a 2D \\& 3D Visual Saliency Dataset}, \nbooktitle =  {Proceedings of the ACM Symposium on Applied Perception}, \nseries =  {SAP '13}, \nyear =  {2013}, \nisbn =  {978-1-4503-2262-1}, \nlocation =  {Dublin, Ireland}, \npages =  {132--132}, \nnumpages =  {1}, \nurl =  {http://doi.acm.org/10.1145/2492494.2501889}, \ndoi =  {10.1145/2492494.2501889}, \nacmid =  {2501889}, \npublisher =  {ACM},\naddress =  {New York, NY, USA}}\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Robust Bayesian Fitting of 3D Morphable Model.\n \n \n \n \n\n\n \n Arellano, C.; and Dahyot, R.\n\n\n \n\n\n\n In Proceedings of the 10th European Conference on Visual Media Production, of CVMP '13, pages 9:1–9:10, New York, NY, USA, 2013. ACM\n http://doi.acm.org/10.1145/2534008.2534013\n\n\n\n
\n\n\n\n \n \n \"RobustPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@inproceedings{CVMP2013Arellano, \nauthor =  {C. Arellano and R. Dahyot}, \ntitle =  {Robust Bayesian Fitting of 3D Morphable Model}, \nbooktitle =  {Proceedings of the 10th European Conference on Visual Media Production},\nseries =  {CVMP '13}, \nyear =  {2013}, \nisbn =  {978-1-4503-2589-9}, \nlocation =  {London, United Kingdom}, \npages =  {9:1--9:10}, \narticleno =  {9}, \nnumpages =  {10}, \nnote =  {http://doi.acm.org/10.1145/2534008.2534013},\nurl = {https://roznn.github.io/PDF/RzDCVMP2013.pdf},\ndoi =  {10.1145/2534008.2534013}, \nabstract = {We propose to fit automatically a 3D morphable face model to a point cloud captured with an RGB-D sensor. \nBoth data sets, the shape model and the target point cloud, are modelled as two probability density functions (pdfs). \nRigid registration (rotation and translation) and reconstruction on the model is performed by minimising the Euclidean distance \nbetween these two pdfs augmented with a multivariate Gaussian prior. Our resulting process is robust and it does not require point-to-point correspondence.\nExperimental results on synthetic and real data illustrate the performance of this novel approach.},\nacmid =  {2534013}, \npublisher =  {ACM}, \naddress =  {New York, NY, USA}, \nkeywords =  {3D face reconstruction, L2E, RGB-D sensor, computer vision, divergence, morphable models, registration, shape fitting}}\n\n
\n
\n\n\n
\n We propose to fit automatically a 3D morphable face model to a point cloud captured with an RGB-D sensor. Both data sets, the shape model and the target point cloud, are modelled as two probability density functions (pdfs). Rigid registration (rotation and translation) and reconstruction on the model is performed by minimising the Euclidean distance between these two pdfs augmented with a multivariate Gaussian prior. Our resulting process is robust and it does not require point-to-point correspondence. Experimental results on synthetic and real data illustrate the performance of this novel approach.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2012\n \n \n (7)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Extrinsic camera parameters estimation for shape-from-depths.\n \n \n \n \n\n\n \n Ruttle, J.; Arellano, C.; and Dahyot, R.\n\n\n \n\n\n\n In 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO), pages 1985-1989, Aug 2012. \n \n\n\n\n
\n\n\n\n \n \n \"ExtrinsicPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@INPROCEEDINGS{6334154, \nauthor =  {J. Ruttle and C. Arellano and R. Dahyot}, \nbooktitle =  {2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)}, \ntitle =  {Extrinsic camera parameters estimation for shape-from-depths}, \nurl = {https://www.eurasip.org/Proceedings/Eusipco/Eusipco2012/Conference/papers/1569583097.pdf},\neprint = {https://ieeexplore.ieee.org/document/6334154},\nabstract = {3D reconstruction from multiple view images requires that\ncamera parameters are very accurately known, and standard\ncamera calibration techniques [1] often fail to provide the required level of accuracy for the extrinsic camera parameters.\nUsing the Kinect depth camera, we propose to estimate camera parameters by minimising the cross correlation between\ndensity functions modelled for each recorded depth image.\nWe illustrate experimentally how this improves the modelling\nfor estimating 3D shape from Depths.},\nyear =  {2012}, \nvolume =  {}, \nnumber =  {}, \npages =  {1985-1989}, \nkeywords =  {calibration;cameras;correlation methods;image reconstruction;parameter estimation;extrinsic camera parameter estimation;shape-from-depths;3D reconstruction;multiple view images;standard camera calibration techniques;kinect depth camera;recorded depth images;Cameras;Calibration;Cost function;Shape;Probability density function;Computational modeling;Correlation;Shape-from-Silhouettes (SfS);Shape-from-Depths (SfD);Multiview geometry},\nISSN =  {2219-5491},\nmonth =  {Aug}}\n\n
\n
\n\n\n
\n 3D reconstruction from multiple view images requires that camera parameters are very accurately known, and standard camera calibration techniques [1] often fail to provide the required level of accuracy for the extrinsic camera parameters. Using the Kinect depth camera, we propose to estimate camera parameters by minimising the cross correlation between density functions modelled for each recorded depth image. We illustrate experimentally how this improves the modelling for estimating 3D shape from Depths.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Mean shift algorithm for robust rigid registration between Gaussian Mixture Models.\n \n \n \n \n\n\n \n Arellano, C.; and Dahyot, R.\n\n\n \n\n\n\n In 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO), pages 1154-1158, Aug 2012. \n \n\n\n\n
\n\n\n\n \n \n \"MeanPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@INPROCEEDINGS{6334159, \nauthor =  {C. {Arellano} and R. {Dahyot}}, \nbooktitle =  {2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)},\ntitle =  {Mean shift algorithm for robust rigid registration between Gaussian Mixture Models}, \nyear =  {2012}, \nvolume =  {}, \nnumber =  {}, \nurl = {https://www.eurasip.org/Proceedings/Eusipco/Eusipco2012/Conference/papers/1569583125.pdf},\neprint = {https://ieeexplore.ieee.org/document/6334159},\nabstract = {We present a Mean shift (MS) algorithm for solving the rigid point set transformation estimation [1]. \nOur registration algorithm minimises exactly the Euclidean distance between Gaussian Mixture Models (GMMs). \nWe show experimentally that our algorithm is more robust than previous implementations [1], thanks to both using an annealing \nframework (to avoid local extrema) and using variable bandwidths in our density estimates. \nOur approach is applied to 3D real data sets captured with a Lidar scanner and Kinect sensor.},\npages =  {1154-1158},\nkeywords =  {Gaussian processes;image registration;image sensors;optical radar;optical scanners;mean shift algorithm;robust rigid registration algorithm;Gaussian mixture models;MS algorithm;rigid point set transformation estimation;Euclidean distance;GMM;annealing framework;density estimation;3D real data sets;lidar scanner;Kinect sensor;Bandwidth;Density functional theory;Robustness;Kernel;Estimation;Annealing;Cost function;Mean Shift;Registration;Gaussian Mixture Models;Rigid Transformation}, \nISSN =  {2219-5491},\nmonth =  {Aug}}\n\n
\n
\n\n\n
\n We present a Mean shift (MS) algorithm for solving the rigid point set transformation estimation [1]. Our registration algorithm minimises exactly the Euclidean distance between Gaussian Mixture Models (GMMs). We show experimentally that our algorithm is more robust than previous implementations [1], thanks to both using an annealing framework (to avoid local extrema) and using variable bandwidths in our density estimates. Our approach is applied to 3D real data sets captured with a Lidar scanner and Kinect sensor.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n Bayesian Shape from Silhouettes.\n \n \n \n\n\n \n Kim, D.; and Dahyot, R.\n\n\n \n\n\n\n In Salerno, E.; Çetin, A. E.; and Salvetti, O., editor(s), Computational Intelligence for Multimedia Understanding, pages 78–89, Berlin, Heidelberg, 2012. Springer Berlin Heidelberg\n \n\n\n\n
\n\n\n\n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 4 downloads\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@InProceedings{KimMuscle2011, \nauthor =  {Kim, Donghoon and Dahyot, Rozenn}, \neditor =  {Salerno, Emanuele and {\\c{C}}etin, A. Enis and Salvetti, Ovidio},\ntitle =  {Bayesian Shape from Silhouettes}, \nbooktitle =  {Computational Intelligence for Multimedia Understanding}, \nyear =  {2012}, \npublisher =  {Springer Berlin Heidelberg}, \naddress =  {Berlin, Heidelberg},\npages =  {78--89},\nabstract =  {This paper extends the likelihood kernel density estimate of the visual hull proposed by Kim et al [1] by introducing a prior.\nInference of the shape is performed using a meanshift algorithm over a posterior kernel density function that is refined iteratively using\nboth a multiresolution framework (to avoid local maxima) and using KNN for selecting the best reconstruction basis at each iteration. \nThis approach allows us to recover concave areas of the shape that are usually lost when estimating the visual hull.}, \nisbn =  {978-3-642-32436-9}, \ndoi =  {10.1007/978-3-642-32436-9_7}}\n\n
\n
\n\n\n
\n This paper extends the likelihood kernel density estimate of the visual hull proposed by Kim et al [1] by introducing a prior. Inference of the shape is performed using a meanshift algorithm over a posterior kernel density function that is refined iteratively using both a multiresolution framework (to avoid local maxima) and using KNN for selecting the best reconstruction basis at each iteration. This approach allows us to recover concave areas of the shape that are usually lost when estimating the visual hull.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Feature Selection Using Visual Saliency for Content-Based Image Retrieval.\n \n \n \n \n\n\n \n Zdziarski, Z.; and Dahyot, R.\n\n\n \n\n\n\n In 23rd IET Irish Signals and Systems Conference, Maynooth, Ireland, June, 28-29 2012. \n https://ieeexplore.ieee.org/document/6621173/\n\n\n\n
\n\n\n\n \n \n \"FeaturePaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@Inproceedings{ISSC2012Ziggy,\ntitle =  {Feature Selection Using Visual Saliency for Content-Based Image Retrieval},\nauthor =  {Z. Zdziarski and R. Dahyot}, \nbooktitle =  {23rd IET Irish Signals and Systems Conference},\nmonth =  {June, 28-29}, \nyear =  {2012}, \naddress =  {Maynooth, Ireland}, \ndoi =  {10.1049/ic.2012.0194}, \nabstract = {Saliency algorithms in content-based image retrieval are employed to retrieve the most important regions of an image \nwith the idea that these regions hold the essence of representative information. Such regions are then typically analysed and \ndescribed for future retrieval/classification tasks rather than the entire image itself - thus minimising computational resources required.\nWe show that we can select a small number\nof features for indexing using a visual saliency measure without reducing the performance of classifiers trained to find objects.},\nurl = {https://mural.maynoothuniversity.ie/15270/1/RD_feature.pdf},\nnote= {https://ieeexplore.ieee.org/document/6621173/}}\n\n
\n
\n\n\n
\n Saliency algorithms in content-based image retrieval are employed to retrieve the most important regions of an image with the idea that these regions hold the essence of representative information. Such regions are then typically analysed and described for future retrieval/classification tasks rather than the entire image itself - thus minimising computational resources required. We show that we can select a small number of features for indexing using a visual saliency measure without reducing the performance of classifiers trained to find objects.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n On Using Anisotropic Diffusion for Skeleton Extraction.\n \n \n \n \n\n\n \n Direkoglu, C.; Dahyot, R.; and Manzke, M.\n\n\n \n\n\n\n International Journal of Computer Vision, 100(2): 170–189. Nov 2012.\n \n\n\n\n
\n\n\n\n \n \n \"OnPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{IJCV2012Cem, \nauthor =  {Direkoglu, Cem and Dahyot, Rozenn and Manzke, Michael},\ntitle =  {On Using Anisotropic Diffusion for Skeleton Extraction}, \njournal =  {International Journal of Computer Vision}, \nyear =  {2012}, \nmonth =  {Nov}, \nday =  {01}, \nvolume =  {100}, \nnumber =  {2}, \npages =  {170--189}, \nabstract =  {We present a novel and effective skeletonization algorithm for binary and gray-scale images, \nbased on the anisotropic heat diffusion analogy. We diffuse the image in the direction normal to the feature \nboundaries and also allow tangential diffusion (curvature-decreasing diffusion) to contribute slightly. \nThe proposed anisotropic diffusion provides a high-quality medial function in the image: it removes noise and preserves \nprominent curvatures of the shape along the level-sets (skeleton features). The skeleton strength map, which provides the \nlikelihood that a point is part of the skeleton, is defined by the mean curvature measure. Finally, a thin, binary skeleton\nis obtained by non-maxima suppression and hysteresis thresholding of the skeleton strength map. Our method outperforms the most\nclosely related and most popular skeleton extraction methods, especially in noisy conditions. Results show that the proposed approach \nis better at handling noise in images and preserving the skeleton features at the centerline of the shape.},\nissn =  {1573-1405}, \ndoi =  {10.1007/s11263-012-0540-9},\nurl =  {https://mural.maynoothuniversity.ie/15120/1/RD_on%20using.pdf}}\n\n
\n
\n\n\n
\n We present a novel and effective skeletonization algorithm for binary and gray-scale images, based on the anisotropic heat diffusion analogy. We diffuse the image in the direction normal to the feature boundaries and also allow tangential diffusion (curvature-decreasing diffusion) to contribute slightly. The proposed anisotropic diffusion provides a high-quality medial function in the image: it removes noise and preserves prominent curvatures of the shape along the level-sets (skeleton features). The skeleton strength map, which provides the likelihood that a point is part of the skeleton, is defined by the mean curvature measure. Finally, a thin, binary skeleton is obtained by non-maxima suppression and hysteresis thresholding of the skeleton strength map. Our method outperforms the most closely related and most popular skeleton extraction methods, especially in noisy conditions. Results show that the proposed approach is better at handling noise in images and preserving the skeleton features at the centerline of the shape.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Shape Model Fitting Using non-Isotropic GMM.\n \n \n \n \n\n\n \n Arellano, C.; and Dahyot, R.\n\n\n \n\n\n\n In IET Irish Signals and Systems Conference (ISSC 2012), pages 1-6, Maynooth, Ireland, June, 28-29 2012. \n \n\n\n\n
\n\n\n\n \n \n \"ShapePaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 1 download\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@Inproceedings{ISSC2012Claudia,\ntitle =  {Shape Model Fitting Using non-Isotropic GMM},\nauthor =  {C. Arellano and R. Dahyot}, \nbooktitle =  {IET Irish Signals and Systems Conference (ISSC 2012)},\nyear =  {2012}, \nmonth =  {June, 28-29}, \npages =  {1-6}, \naddress =  {Maynooth, Ireland},\nurl =  {https://mural.maynoothuniversity.ie/15273/1/RD_shape.pdf}, \ndoi =  {10.1049/ic.2012.0196}, \nabstract = {We present a Mean Shift algorithm for fitting shape models. This algorithm maximises a posterior density function \nwhere the likelihood is defined as the Euclidean distance between two Gaussian mixture density functions, one modelling the observations \nwhile the other corresponds to the shape model. We explore the role of the covariance matrix in the Gaussian kernel for encoding the shape \nof the model in the density function. Results show that using non-isotropic covariance matrices improves the efficiency of \nthe algorithm and allows the number of kernels in the mixture to be reduced without compromising the robustness of the algorithm.},\nkeywords =  {Gaussian processes;covariance matrices;solid modelling;Euclidean distance;Gaussian kernel;Gaussian mixture density functions;mean shift algorithm;nonisotropic GMM;nonisotropic covariance matrices;posterior density function;shape model fitting;Fitting Algorithm;Gaussian Mixture Models;Mean Shift;Morphable Models}}\n\n
\n
\n\n\n
\n We present a Mean Shift algorithm for fitting shape models. This algorithm maximises a posterior density function where the likelihood is defined as the Euclidean distance between two Gaussian mixture density functions, one modelling the observations while the other corresponds to the shape model. We explore the role of the covariance matrix in the Gaussian kernel for encoding the shape of the model in the density function. Results show that using non-isotropic covariance matrices improves the efficiency of the algorithm and allows the number of kernels in the mixture to be reduced without compromising the robustness of the algorithm.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Shape model fitting algorithm without point correspondence.\n \n \n \n \n\n\n \n Arellano, C.; and Dahyot, R.\n\n\n \n\n\n\n In 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO), pages 934-938, Aug 2012. \n \n\n\n\n
\n\n\n\n \n \n \"ShapePaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@Inproceedings{Eusipco2012Arellano1, \nauthor =  {C. Arellano and R. Dahyot}, \nbooktitle =  {2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)}, \ntitle =  {Shape model fitting algorithm without point correspondence}, \nyear =  {2012}, \nvolume =  {}, \nnumber =  {}, \npages =  {934-938}, \nabstract = {In this paper, we present a Mean Shift algorithm that does\nnot require point correspondence to fit shape models. The observed data and the shape model are represented as mixtures\nof Gaussians. Using a Bayesian framework, we propose to\nmodel the likelihood using the Euclidean distance between\nthe two Gaussian mixture density functions while the latent\nvariables are modelled with a Gaussian prior. We show the\nperformance of our MS algorithm for fitting a 2D hand model\nand a 3D Morphable Model of faces to point clouds.},\nkeywords =  {Gaussian processes;shape recognition;2D hand model;3D morphable model;Bayesian framework;Euclidean distance;Gaussian mixture density functions;Gaussian prior;Gaussians mixtures;MS algorithm;mean shift algorithm;shape model fitting algorithm;Computational modeling;Data models;Euclidean distance;Robustness;Shape;Signal processing algorithms;Solid modeling;Gaussian Mixture Models;Mean Shift;Morphable Models;Shape Fitting}, \ndoi =  {}, \nISSN =  {2219-5491}, \nmonth =  {Aug},\nurl =  {https://www.eurasip.org/Proceedings/Eusipco/Eusipco2012/Conference/papers/1569582293.pdf}, \neprint =  {https://ieeexplore.ieee.org/document/6333999/}}\n\n
\n
\n\n\n
\n In this paper, we present a Mean Shift algorithm that does not require point correspondence to fit shape models. The observed data and the shape model are represented as mixtures of Gaussians. Using a Bayesian framework, we propose to model the likelihood using the Euclidean distance between the two Gaussian mixture density functions while the latent variables are modelled with a Gaussian prior. We show the performance of our MS algorithm for fitting a 2D hand model and a 3D Morphable Model of faces to point clouds.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2010\n \n \n (6)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Synthesizing Structured Image Hybrids.\n \n \n \n \n\n\n \n Risser, E.; Han, C.; Dahyot, R.; and Grinspun, E.\n\n\n \n\n\n\n ACM Trans. Graph., 29(4): 85:1–85:6. jul 2010.\n \n\n\n\n
\n\n\n\n \n \n \"SynthesizingPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{Risser2010, \nauthor =  {Risser, Eric and Han, Charles and Dahyot, Rozenn and Grinspun, Eitan}, \ntitle =  {Synthesizing Structured Image Hybrids}, \njournal =  {ACM Trans. Graph.},\nissue_date =  {July 2010}, \nvolume =  {29}, \nnumber =  {4}, \nmonth =  {jul}, \nyear =  {2010},\nissn =  {0730-0301}, \npages =  {85:1--85:6}, \narticleno =  {85}, \nnumpages =  {6}, \neprint =  {http://doi.acm.org/10.1145/1778765.1778822},\nurl = {http://www.cs.columbia.edu/cg/hybrids/hybrids.pdf},\ndoi =  {10.1145/1778765.1778822}, \nabstract = {Example-based texture synthesis algorithms generate novel texture images from example data.\nA popular hierarchical pixel-based approach uses spatial jitter to introduce diversity, at the risk of breaking coarse structure beyond repair. \nWe propose a multiscale descriptor that enables appearance-space jitter, which retains structure. \nThis idea enables repurposing of existing texture synthesis\nimplementations for a qualitatively different problem statement and class of inputs: generating hybrids of structured images.},\nacmid =  {1778822}, \npublisher =  {ACM}, address =  {New York, NY, USA}}\n\n
\n
\n\n\n
\n Example-based texture synthesis algorithms generate novel texture images from example data. A popular hierarchical pixel-based approach uses spatial jitter to introduce diversity, at the risk of breaking coarse structure beyond repair. We propose a multiscale descriptor that enables appearance-space jitter, which retains structure. This idea enables repurposing of existing texture synthesis implementations for a qualitatively different problem statement and class of inputs: generating hybrids of structured images.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n 3D shape estimation from silhouettes using Mean-shift.\n \n \n \n \n\n\n \n Kim, D.; Ruttle, J.; and Dahyot, R.\n\n\n \n\n\n\n In IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP 2010) , pages 1430 -1433, March 2010. \n \n\n\n\n
\n\n\n\n \n \n \"3DPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@INPROCEEDINGS{Kim2010Icassp, \nauthor =  {D. Kim and  J. Ruttle and R. Dahyot}, \nbooktitle =  {IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP 2010) }, \ntitle =  {3D shape estimation from silhouettes using Mean-shift}, \nyear =  {2010}, \nmonth =  {March}, \nvolume =  {}, \nnumber =  {}, \npages =  {1430 -1433}, \nkeywords =  {},\nurl={https://mural.maynoothuniversity.ie/15281/1/RD_3D%20shape.pdf},\ndoi =  {10.1109/ICASSP.2010.5495474}, \nabstract = {In this article, a novel method to accurately estimate the 3D surface of objects of interest is proposed. \nEach ray projected from the 2D image plane to 3D space is modelled with the Gaussian kernel function. \nThen a mean shift algorithm with an annealing scheme is used to find the maxima of the probability density function and recover the 3D surface. \nExperimental results show that our method estimates the 3D surface more accurately than the Radon transform-based approach.},\nISSN =  {1520-6149}}\n\n
\n
\n\n\n
\n In this article, a novel method to accurately estimate the 3D surface of objects of interest is proposed. Each ray projected from the 2D image plane to 3D space is modelled with the Gaussian kernel function. Then a mean shift algorithm with an annealing scheme is used to find the maxima of the probability density function and recover the 3D surface. Experimental results show that our method estimates the 3D surface more accurately than the Radon transform-based approach.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n Détection et Reconnaissance de la signalisation verticale par analyse d'images.\n \n \n \n\n\n \n Charbonnier, P.; Dahyot, R.; Vik, T.; and Heitz, F.\n\n\n \n\n\n\n Detection et reconnaissance de la signalisation verticale par analyse d’images (Ed: P. Foucher). Etudes et Recherches des laboratoires des Ponts et Chaussées, CR53 ( ISBN 978-2-7208-2578-1), July 2010.\n \n\n\n\n
\n\n\n\n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inbook{LCPC2010, \nChapter =  {D\\'{e}tection et Reconnaissance de la signalisation verticale par analyse d'images}, \nauthor =  {P. Charbonnier and R. Dahyot and T. Vik and F. Heitz},\ntitle =  {Detection et reconnaissance de la signalisation verticale par analyse d’images (Ed: P. Foucher)},\npublisher =  {Etudes et Recherches des laboratoires des Ponts et Chaussées, CR53 ( ISBN 978-2-7208-2578-1)},\nmonth =  {July},\nyear =  {2010}, }\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Skeleton Extraction via Anisotropic Heat Flow.\n \n \n \n \n\n\n \n Direkoglu, C.; Dahyot, R.; and Manzke, M.\n\n\n \n\n\n\n In Proceedings of the British Machine Vision Conference, pages 61.1–61.11, 2010. BMVA Press\n \n\n\n\n
\n\n\n\n \n \n \"SkeletonPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{Direkoglu2010,\ntitle =  {Skeleton Extraction via Anisotropic Heat Flow},\nauthor =  {Direkoglu, Cem and Dahyot, Rozenn and Manzke, Michael}, \nyear =  {2010}, \npages =  {61.1--61.11},\nbooktitle =  {Proceedings of the British Machine Vision Conference},\npublisher =  {BMVA Press},\neditors =  {Labrosse, Fr\'ed\'eric and Zwiggelaar, Reyer and Liu, Yonghuai and Tiddeman, Bernie}, \nisbn =  {1-901725-40-5},\nurl = {http://www.bmva.org/bmvc/2010/conference/paper61/paper61.pdf},\nabstract = {We introduce a novel skeleton extraction algorithm for binary and gray-scale images,\nbased on the anisotropic heat diffusion analogy. We propose to diffuse the image predominantly in the direction normal to the feature boundaries and also allow tangential diffusion\nto contribute slightly. The proposed anisotropic diffusion provides a high-quality medial function in the image, since it removes noise and preserves prominent curvatures\nof the shape along the level-sets (skeleton locations). Then the skeleton strength map,\nwhich provides the likelihood of a point being a skeleton point, is obtained by computing the mean\ncurvature of the level-sets. The overall process is completed by non-maxima suppression\nand hysteresis thresholding to obtain a thin, binary skeleton. Results indicate that this\napproach has advantages in handling noise in the image and in obtaining a smooth shape\nskeleton, because of the directional averaging inherent in our new anisotropic heat flow.},\ndoi =  {10.5244/C.24.61}}\n\n
\n
\n\n\n
\n We introduce a novel skeleton extraction algorithm for binary and gray-scale images, based on the anisotropic heat diffusion analogy. We propose to diffuse the image predominantly in the direction normal to the feature boundaries and also allow tangential diffusion to contribute slightly. The proposed anisotropic diffusion provides a high-quality medial function in the image, since it removes noise and preserves prominent curvatures of the shape along the level-sets (skeleton locations). Then the skeleton strength map, which provides the likelihood of a point being a skeleton point, is obtained by computing the mean curvature of the level-sets. The overall process is completed by non-maxima suppression and hysteresis thresholding to obtain a thin, binary skeleton. Results indicate that this approach has advantages in handling noise in the image and in obtaining a smooth shape skeleton, because of the directional averaging inherent in our new anisotropic heat flow.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Smooth Kernel Density Estimate for Multiple View Reconstruction.\n \n \n \n \n\n\n \n Ruttle, J.; Manzke, M.; and Dahyot, R.\n\n\n \n\n\n\n In proceedings of The 7th European Conference for Visual Media Production, CVMP 2010, pages 74 -81, 17 - 18 November 2010. \n \n\n\n\n
\n\n\n\n \n \n \"SmoothPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{Ruttle2010CVMP, \ntitle =  {Smooth Kernel Density Estimate for Multiple View Reconstruction}, \nauthor =  {J. Ruttle and M. Manzke and R. Dahyot},\nbooktitle =  {proceedings of The 7th European Conference for Visual Media Production, CVMP 2010}, \nlocation =  {London UK}, \nmonth =  {17 - 18 November}, \npages =  {74 -81}, \nyear =  {2010},\nabstract = {We present a statistical framework to merge the information from silhouettes segmented in multiple view images to infer the 3D shape of an object. \nThe approach generalises the robust but discrete modelling of the visual hull by using the concept of averaged likelihoods.\nOne resulting advantage of our framework is that the objective function is continuous and therefore an iterative gradient ascent algorithm\ncan be defined to efficiently search the space. Moreover, this results in a method that is less memory-demanding and well suited \nto a parallel processing architecture.\nExperimental results show that this approach efficiently provides a robust initial estimate of the 3D shape of the object in view.},\nurl={https://mural.maynoothuniversity.ie/15282/1/RD_smooth.pdf},\ndoi =  {10.1109/CVMP.2010.17}}\n\n
\n
\n\n\n
\n We present a statistical framework to merge the information from silhouettes segmented in multiple view images to infer the 3D shape of an object. The approach generalises the robust but discrete modelling of the visual hull by using the concept of averaged likelihoods. One resulting advantage of our framework is that the objective function is continuous and therefore an iterative gradient ascent algorithm can be defined to efficiently search the space. Moreover, this results in a method that is less memory-demanding and well suited to a parallel processing architecture. Experimental results show that this approach efficiently provides a robust initial estimate of the 3D shape of the object in view.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Stereo Images for 3D Face Applications: A Literature Review.\n \n \n \n \n\n\n \n Arellano, C.; and Dahyot, R.\n\n\n \n\n\n\n In International Machine Vision and Image Processing Conference (IMVIP 2010), Limerick Ireland, September 2010. \n \n\n\n\n
\n\n\n\n \n \n \"StereoPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{Arellano2010, \ntitle =  {Stereo Images for 3D Face Applications: A Literature Review}, \nauthor =  {C. Arellano and R. Dahyot}, \nbooktitle =  {International Machine Vision and Image Processing Conference (IMVIP 2010)}, \naddress =  {Limerick Ireland}, \nmonth =  {September}, \nyear =  {2010}, \nurl =  {http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.394.4985&rep=rep1&type=pdf}}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2009\n \n \n (6)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n 3D Head Reconstruction using Multi-camera Stream.\n \n \n \n \n\n\n \n Kim, D.; and Dahyot, R.\n\n\n \n\n\n\n In International Machine Vision and Image Processing conference (IMVIP 2009), pages 156-161, Dublin, Ireland, September 2009. \n \n\n\n\n
\n\n\n\n \n \n \"3DPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{Kim09IMVIP,\ntitle =  {3D Head Reconstruction using Multi-camera Stream},\nauthor =  {D. Kim and R. Dahyot},\nbooktitle =  {International Machine Vision and Image Processing conference (IMVIP 2009)}, \npages =  {156-161}, \naddress =  {Dublin, Ireland}, \nmonth =  {September},\nyear =  {2009},\nabstract = {Given information from many cameras, one can hope to get a complete 3D representation of an object. \nPintavirooj and Sangworasil exploit this idea and present a system that sequentially records images from multiple viewpoints\nto reconstruct the 3D shape of a static object of interest [1]. For instance, using a 60 angle of view on the image, \nthey manage to get its accurate 3D reconstruction [1]. Unfortunately, when considering applications such as video surveillance,\nit is not reasonable to expect that 60 cameras will give simultaneous images of a person of interest. However, we can expect that \nthe person will move over time and show sequentially different poses of her/his head to at least one or a few cameras.\nThis article proposes a technique for recovering an accurate 3D shape by combining views recorded at different times.},\nurl =  {https://ieeexplore.ieee.org/document/5319298/}, \ndoi =  {10.1109/IMVIP.2009.35}}\n\n
\n
\n\n\n
\n Given information from many cameras, one can hope to get a complete 3D representation of an object. Pintavirooj and Sangworasil exploit this idea and present a system that sequentially records images from multiple viewpoints to reconstruct the 3D shape of a static object of interest [1]. For instance, using a 60 angle of view on the image, they manage to get its accurate 3D reconstruction [1]. Unfortunately, when considering applications such as video surveillance, it is not reasonable to expect that 60 cameras will give simultaneous images of a person of interest. However, we can expect that the person will move over time and show sequentially different poses of her/his head to at least one or a few cameras. This article proposes a technique for recovering an accurate 3D shape by combining views recorded at different times.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Estimating 3D Scene Flow from Multiple 2D Optical Flows.\n \n \n \n \n\n\n \n Ruttle, J.; Manzke, M.; and Dahyot, R.\n\n\n \n\n\n\n In International Machine Vision and Image Processing Conference (IMVIP 2009), pages 6-11, Dublin, Ireland, September 2009. \n URI: http://hdl.handle.net/2262/30634\n\n\n\n
\n\n\n\n \n \n \"EstimatingPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{Ruttle09Imvip, \ntitle =  {Estimating 3D Scene Flow from Multiple 2D Optical Flows}, \nauthor =  {J. Ruttle and M. Manzke and R. Dahyot}, \nbooktitle =  {International Machine Vision and Image Processing Conference (IMVIP 2009)}, \npages =  {6-11}, \naddress =  {Dublin, Ireland}, \nmonth =  {September},\nyear =  {2009}, \nabstract = {Scene flow is the motion of the surface points in the 3D world. For a camera,\nit is seen as a 2D optical flow in the image plane. Knowing the scene flow can be very useful as it gives an idea of \nthe surface geometry of the objects in the scene and how those objects are moving. Four methods for calculating the scene \nflow given multiple optical flows have been explored and detailed in this paper along with the basic mathematics surrounding \nmulti-view geometry. It was found that given multiple optical flows\nit is possible to estimate the scene flow to different levels of detail depending on the level of prior information present.},\nurl =  {https://mural.maynoothuniversity.ie/15285/1/RD_estimating.pdf},\nnote = {URI: http://hdl.handle.net/2262/30634},\ndoi =  {10.1109/IMVIP.2009.8}}
\n
\n\n\n
\n Scene flow is the motion of the surface points in the 3D world. For a camera, it is seen as a 2D optical flow in the image plane. Knowing the scene flow can be very useful as it gives an idea of the surface geometry of the objects in the scene and how those objects are moving. Four methods for calculating the scene flow given multiple optical flows have been explored and detailed in this paper along with the basic mathematics surrounding multi-view geometry. It was found that given multiple optical flows it is possible to estimate the scene flow to different levels of detail depending on the level of prior information present.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Mean-shift for Statistical Hough Transform.\n \n \n \n \n\n\n \n Dahyot, R.\n\n\n \n\n\n\n Technical Report 01/09, School of Computer Science and Statistics, Trinity College Dublin, April 2009.\n \n\n\n\n
\n\n\n\n \n \n \"Mean-shiftPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@techreport{Dahyot09TR, \ntitle =  {Mean-shift for Statistical Hough Transform},\nauthor =  {R. Dahyot},\nnumber =  {01/09}, \ninstitution =  {School of Computer Science and Statistics, Trinity College Dublin},\nmonth =  {April}, \nyear =  {2009},\nurl =  {https://www.scss.tcd.ie/disciplines/statistics/tech-reports/09-01.pdf}}\n\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Robust Panning Analysis for Slideshow Detection in Video Databases.\n \n \n \n \n\n\n \n Zdziarski, Z.; and Dahyot, R.\n\n\n \n\n\n\n In International Machine Vision and Image Processing Conference (IMVIP 2009), pages 89-93, Dublin, Ireland, September 2009. \n \n\n\n\n
\n\n\n\n \n \n \"RobustPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{Zdziarski09Imvip,\ntitle =  {Robust Panning Analysis for Slideshow Detection in Video Databases}, \nauthor =  {Z. Zdziarski and R. Dahyot}, \nbooktitle =  {International Machine Vision and Image Processing Conference (IMVIP 2009)}, \npages =  {89-93}, \naddress =  {Dublin, Ireland}, \nmonth =  {September}, \nyear =  {2009}, \nabstract = {We present an algorithm for slideshow detection in video databases such as YouTube or Blip.TV.\nOur solution is based around feature tracking to extract movement between sequentially captured frames.\nThis movement is then analysed through the use of the Hough transform and compared against behaviour commonly exhibited \nby slideshows: still and panning static images.\nWe show experimentally the effectiveness of this novel idea and approach.},\nurl =  {https://mural.maynoothuniversity.ie/15284/1/RD_robust.pdf},\ndoi =  {10.1109/IMVIP.2009.23}}\n\n
\n
\n\n\n
\n We present an algorithm for slideshow detection in video databases such as YouTube or Blip.TV. Our solution is based around feature tracking to extract movement between sequentially captured frames. This movement is then analysed through the use of the Hough transform and compared against behaviour commonly exhibited by slideshows: still and panning static images. We show experimentally the effectiveness of this novel idea and approach.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Statistical Hough Transform.\n \n \n \n \n\n\n \n Dahyot, R.\n\n\n \n\n\n\n IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(8): 1502-1509. Aug 2009.\n URI: http://hdl.handle.net/2262/31106 - Github: https://github.com/Roznn/Statistical-Hough-Transform\n\n\n\n
\n\n\n\n \n \n \"StatisticalPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 1 download\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@article{Dahyot08pami, \nauthor =  {R. Dahyot}, \njournal =  {IEEE Transactions on Pattern Analysis and Machine Intelligence},\ntitle =  {Statistical Hough Transform},\nyear =  {2009}, \nvolume =  {31}, \nnumber =  {8},\npages =  {1502-1509}, \nnote = {URI: http://hdl.handle.net/2262/31106 - Github: https://github.com/Roznn/Statistical-Hough-Transform},\nurl = {https://mural.maynoothuniversity.ie/15126/1/RD_stat.pdf},\nabstract = {The standard Hough transform is a popular method in image processing and is traditionally estimated using histograms.\nDensities modeled with histograms in high-dimensional spaces and/or with few observations can be very sparse and highly demanding in memory.\nIn this paper, we propose first to extend the formulation to continuous kernel estimates. Second, when dependencies between variables are well\ntaken into account, the estimated density is also robust to noise and insensitive to the choice of the origin of the spatial coordinates.\nFinally, our new statistical framework is unsupervised (all needed parameters are automatically estimated) and flexible\n(priors can easily be attached to the observations). We show experimentally that our new modeling encodes better the alignment content of images.},\nkeywords =  {Hough transforms;object detection;statistical analysis;continuous kernel estimate;image processing;line detection;spatial domain coordinate;statistical Hough transform;Hough transform;Image Processing and Computer Vision;Radon transform;Transform methods;kernel probability density function;line detection.;uncertainty},\ndoi =  {10.1109/TPAMI.2008.288},\neprint = {http://www.tara.tcd.ie/handle/2262/31106},\nISSN =  {0162-8828}, \nmonth =  {Aug}}\n\n
\n
\n\n\n
\n The standard Hough transform is a popular method in image processing and is traditionally estimated using histograms. Densities modeled with histograms in high-dimensional spaces and/or with few observations can be very sparse and highly demanding in memory. In this paper, we propose first to extend the formulation to continuous kernel estimates. Second, when dependencies between variables are well taken into account, the estimated density is also robust to noise and insensitive to the choice of the origin of the spatial coordinates. Finally, our new statistical framework is unsupervised (all needed parameters are automatically estimated) and flexible (priors can easily be attached to the observations). We show experimentally that our new modeling encodes better the alignment content of images.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n Synchronized real-time multi-sensor motion capture system.\n \n \n \n\n\n \n Ruttle, J.; Manzke, M.; Prazak, M.; and Dahyot, R.\n\n\n \n\n\n\n In SIGGRAPH ASIA '09: ACM SIGGRAPH ASIA 2009 Posters, pages 1–1, New York, NY, USA, 2009. ACM\n \n\n\n\n
\n\n\n\n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{Ruttle09Siggraph, \nauthor =  {J. Ruttle and M. Manzke and M. Prazak and R. Dahyot}, \ntitle =  {Synchronized real-time multi-sensor motion capture system},\nbooktitle =  {SIGGRAPH ASIA '09: ACM SIGGRAPH ASIA 2009 Posters},\nyear =  {2009}, \npages =  {1--1}, \nlocation =  {Yokohama, Japan}, \nabstract = {This work addresses the challenge of synchronizing multiple sources of visible and audible information from a variety of devices,\nwhile capturing human motion in realtime. Video and audio data will be used to augment and enrich a motion capture database\nthat will be released to the research community. While other such augmented motion capture databases exist [Black and Sigal 2006], \nthe goal of this work is to build on these previous works. Critical areas of improvement are in the synchronization between cameras \nand synchronization between devices. Adding an array of audio recording devices to the setup will greatly expand the research\npotential of the database, and the positioning of the cameras will be varied to give greater flexibility. The augmented database will \nfacilitate the testing and validation of human pose estimation and motion tracking techniques, among other applications. \nThis sketch briefly describes some of the interesting\nchallenges faced in setting up the pipeline for capturing the synchronized data and the novel approaches proposed to solve them.},\ndoi =  {10.1145/1666778.1666828}, \npublisher =  {ACM}, \naddress =  {New York, NY, USA}}\n\n
\n
\n\n\n
\n This work addresses the challenge of synchronizing multiple sources of visible and audible information from a variety of devices, while capturing human motion in realtime. Video and audio data will be used to augment and enrich a motion capture database that will be released to the research community. While other such augmented motion capture databases exist [Black and Sigal 2006], the goal of this work is to build on these previous works. Critical areas of improvement are in the synchronization between cameras and synchronization between devices. Adding an array of audio recording devices to the setup will greatly expand the research potential of the database, and the positioning of the cameras will be varied to give greater flexibility. The augmented database will facilitate the testing and validation of human pose estimation and motion tracking techniques, among other applications. This sketch briefly describes some of the interesting challenges faced in setting up the pipeline for capturing the synchronized data and the novel approaches proposed to solve them.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2008\n \n \n (7)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Audio-Visual Processing Tools for Auditory Scene Synthesis.\n \n \n \n \n\n\n \n Kearney, G.; Dahyot, R.; and Boland, F.\n\n\n \n\n\n\n In Audio Engineering Society 124th Convention, May 2008. \n \n\n\n\n
\n\n\n\n \n \n \"Audio-VisualPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{KearneyAES08,\ntitle =  {Audio-Visual Processing Tools for Auditory Scene Synthesis}, \nauthor =  {G. Kearney and R. Dahyot and F. Boland}, \nbooktitle =  {Audio Engineering Society 124th Convention}, \nmonth =  {May}, \nyear =  {2008}, \nabstract = {We present an integrated set of audio-visual tracking and synthesis tools to aid matching of the audio to the video position\nin both horizontal and periphonic sound reinforcement systems. Compensation for screen size and loudspeaker layout for high definition formats \nis incorporated and the spatial localisation of the source is rendered using advanced spatialisation techniques. A subjective comparison \nof several original and enhanced film sequences using the Vector Base Amplitude Panning (VBAP) method is presented. The results show that\nthe encoding of non-contradictory audio-visual spatial information,\nfor presentation on different loudspeaker layouts significantly improves the naturalness of the listening/viewing experience.},\nurl =  {http://www.aes.org/e-lib/browse.cfm?elib=14495}}\n\n
\n
\n\n\n
\n We present an integrated set of audio-visual tracking and synthesis tools to aid matching of the audio to the video position in both horizontal and periphonic sound reinforcement systems. Compensation for screen size and loudspeaker layout for high definition formats is incorporated and the spatial localisation of the source is rendered using advanced spatialisation techniques. A subjective comparison of several original and enhanced film sequences using the Vector Base Amplitude Panning (VBAP) method is presented. The results show that the encoding of non-contradictory audio-visual spatial information, for presentation on different loudspeaker layouts significantly improves the naturalness of the listening/viewing experience.\n
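For context on the spatialisation method named in the abstract, here is a generic two-loudspeaker VBAP gain computation (the textbook formulation, not the authors' tool chain): the source direction is written as a non-negative combination of the two loudspeaker direction vectors and the gains are normalised for constant loudness. The function name and angles are illustrative.

import numpy as np

def vbap_pair_gains(source_deg, spk1_deg, spk2_deg):
    """Classic 2D VBAP for one loudspeaker pair: express the source direction
    as a non-negative combination of the two loudspeaker direction vectors,
    then normalise so that g1**2 + g2**2 = 1 (constant loudness)."""
    def unit(deg):
        a = np.deg2rad(deg)
        return np.array([np.cos(a), np.sin(a)])
    L = np.column_stack([unit(spk1_deg), unit(spk2_deg)])  # 2x2 loudspeaker base
    g = np.linalg.solve(L, unit(source_deg))               # source direction = L @ g
    g = np.clip(g, 0.0, None)                              # pan only inside the pair
    return g / np.linalg.norm(g)

# e.g. gains for a source at +10 degrees with speakers at +30 and -30 degrees:
# g_plus30, g_minus30 = vbap_pair_gains(10, 30, -30)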
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n Bayesian Classification for the Statistical Hough Transform.\n \n \n \n\n\n \n Dahyot, R.\n\n\n \n\n\n\n In 2008 19th International Conference on Pattern Recognition, pages 1 -4, Tampa, Florida, December 2008. \n \n\n\n\n
\n\n\n\n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@inproceedings{Dahyoticpr08,\ntitle =  {Bayesian Classification for the Statistical Hough Transform}, \nauthor =  {R. Dahyot},\nbooktitle =  {2008 19th International Conference on Pattern Recognition},\nmonth =  {December}, \naddress =  {Tampa, Florida}, \nyear =  {2008},\nkeywords =  {Bayes methods;Hough transforms;Radon transforms;image classification;image segmentation;statistical analysis;2D accumulator histogram;Bayesian classification scheme;image space;inverse Radon transform;kernel mixture;statistical Hough transform;Bandwidth;Bayesian methods;Computer science;Discrete transforms;Educational institutions;Histograms;Image segmentation;Kernel;Robustness;Statistics}, \npages =  {1 -4}, \nabstract = {We have introduced the statistical Hough transform that extends the standard Hough transform by using a kernel mixture \nas a robust alternative to the 2 dimensional accumulator histogram. This work develops further this framework by proposing a \nBayesian classification scheme to associate the spatial coordinates (x, y) to one particular class defined in the Hough space. \nIn a first step, we segment the Hough space into meaningful classes. Then using the inverse Radon transform,\nwe backproject the different classes into the image space. We illustrate our approach on a synthetic image and on real images.},\ndoi =  {10.1109/ICPR.2008.4761109},\nISSN =  {1051-4651}}\n\n
\n
\n\n\n
\n We have introduced the statistical Hough transform that extends the standard Hough transform by using a kernel mixture as a robust alternative to the 2 dimensional accumulator histogram. This work develops further this framework by proposing a Bayesian classification scheme to associate the spatial coordinates (x, y) to one particular class defined in the Hough space. In a first step, we segment the Hough space into meaningful classes. Then using the inverse Radon transform, we backproject the different classes into the image space. We illustrate our approach on a synthetic image and on real images.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Face components detection using SURF descriptor and SVMs.\n \n \n \n \n\n\n \n Kim, D.; and Dahyot, R.\n\n\n \n\n\n\n In International Machine Vision and Image Processing conference (IMVIP 2008), 2008. \n \n\n\n\n
\n\n\n\n \n \n \"FacePaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{Donghoon08imvip, \nauthor =  {D. Kim and R. Dahyot},\ntitle =  {Face components detection using SURF descriptor and SVMs},\nbooktitle =  {International Machine Vision and Image Processing conference (IMVIP 2008)},\nyear =  {2008}, \nabstract = {We present a feature-based method to classify salient points as belonging to objects in the face or background classes.\nWe use SURF local descriptors (speeded up robust features) to generate feature vectors and use SVMs (support vector machines) as classifiers. Our system consists of a two-layer hierarchy of SVM classifiers. On the first layer, a single classifier checks whether feature vectors are from face images or not. On the second layer, component labeling is performed using a dedicated classifier for each component (eye, mouth, and nose). This approach has an advantage in operating time because no window-scanning procedure is needed. \nFinally, the system applies geometrical constraints to the labeled descriptors.\nWe show experimentally the efficiency of our approach.},\ndoi =  {10.1109/IMVIP.2008.15}, \nurl =  {https://mural.maynoothuniversity.ie/15287/1/RD_face.pdf}}\n\n
\n
\n\n\n
\n We present a feature-based method to classify salient points as belonging to objects in the face or background classes. We use SURF local descriptors (speeded up robust features) to generate feature vectors and use SVMs (support vector machines) as classifiers. Our system consists of a two-layer hierarchy of SVM classifiers. On the first layer, a single classifier checks whether feature vectors are from face images or not. On the second layer, component labeling is performed using a dedicated classifier for each component (eye, mouth, and nose). This approach has an advantage in operating time because no window-scanning procedure is needed. Finally, the system applies geometrical constraints to the labeled descriptors. We show experimentally the efficiency of our approach.\n
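A schematic sketch of the two-layer classifier hierarchy, assuming the local descriptors (e.g. 64-D SURF vectors) have already been extracted and labelled; scikit-learn's SVC stands in for the SVMs, the array names are placeholders, and the paper's final geometric-constraint step is omitted.

import numpy as np
from sklearn.svm import SVC

# Layer 1: face vs background, applied to every local descriptor.
# Layer 2: one multi-class SVM over the face components (eye, mouth, nose),
#          applied only to descriptors accepted by layer 1.
# X_face, X_bg, X_comp are (N, D) arrays of descriptors and y_comp their
# component labels (e.g. 0=eye, 1=mouth, 2=nose); all are assumed given.

def train_hierarchy(X_face, X_bg, X_comp, y_comp):
    face_clf = SVC(kernel="rbf", gamma="scale")
    face_clf.fit(np.vstack([X_face, X_bg]),
                 np.r_[np.ones(len(X_face)), np.zeros(len(X_bg))])
    comp_clf = SVC(kernel="rbf", gamma="scale")
    comp_clf.fit(X_comp, y_comp)
    return face_clf, comp_clf

def label_descriptors(face_clf, comp_clf, X):
    is_face = face_clf.predict(X).astype(bool)
    labels = np.full(len(X), -1)           # -1 = background descriptor
    if is_face.any():
        labels[is_face] = comp_clf.predict(X[is_face])
    return labels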
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Improving the Quality of Color Colonoscopy Videos.\n \n \n \n \n\n\n \n Dahyot, R.; Vilariño, F.; and Lacey, G.\n\n\n \n\n\n\n EURASIP Journal on Image and Video Processing, 2008(1): 139429. Jan 2008.\n \n\n\n\n
\n\n\n\n \n \n \"ImprovingPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{Dahyot08,\nauthor =  {Dahyot, Rozenn \nand Vilari{\\~{n}}o, Fernando\nand Lacey, Gerard}, \ntitle =  {Improving the Quality of Color Colonoscopy Videos}, \njournal =  {EURASIP Journal on Image and Video Processing},\nyear =  {2008}, \nmonth =  {Jan}, \nday =  {22},\nvolume =  {2008}, \nnumber =  {1},\npages =  {139429},\nabstract =  {Colonoscopy is currently one of the best methods to detect colorectal cancer. \nNowadays, one of the widely used colonoscopes has a monochrome chipset recording successively at 60Hz the R, G, and B \ncomponents, which are merged into one color video stream. Misalignments of the channels occur each time the camera moves, \nand this artefact impedes both online visual inspection by doctors and offline computer analysis of the image data.\nWe propose to restore this artefact by first equalizing the color channels and then performing a robust camera motion estimation and compensation.},\nissn =  {1687-5281}, \ndoi =  {10.1155/2008/139429},\nurl = {https://mural.maynoothuniversity.ie/15123/1/RD_improving.pdf},\neprint =  {https://jivp-eurasipjournals.springeropen.com/articles/10.1155/2008/139429}}\n\n
\n
\n\n\n
\n Colonoscopy is currently one of the best methods to detect colorectal cancer. Nowadays, one of the widely used colonoscopes has a monochrome chipset recording successively at 60Hz the R, G, and B components, which are merged into one color video stream. Misalignments of the channels occur each time the camera moves, and this artefact impedes both online visual inspection by doctors and offline computer analysis of the image data. We propose to restore this artefact by first equalizing the color channels and then performing a robust camera motion estimation and compensation.\n
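A minimal sketch of the two restoration steps named above, under the simplifying assumption that the inter-channel motion is a pure translation: the R and B channels are first histogram-matched (equalized) to the G channel and then re-aligned with FFT phase correlation. The paper's robust parametric camera-motion estimation and compensation is not reproduced; function names are illustrative.

import numpy as np

def equalize_to(reference, channel, n_quantiles=256):
    """Histogram-match 'channel' to 'reference' through their quantile functions."""
    q = np.linspace(0.0, 1.0, n_quantiles)
    return np.interp(channel, np.quantile(channel, q), np.quantile(reference, q))

def shift_to_align(a, b):
    """Integer (dy, dx) such that np.roll(b, (dy, dx), axis=(0, 1)) best matches a,
    estimated by FFT phase correlation (translation-only model)."""
    R = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    corr = np.fft.ifft2(R / (np.abs(R) + 1e-9)).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    if dy > a.shape[0] // 2:
        dy -= a.shape[0]
    if dx > a.shape[1] // 2:
        dx -= a.shape[1]
    return dy, dx

def realign_channels(r, g, b):
    """Equalize R and B to the G channel, then undo their shift relative to G."""
    aligned = []
    for ch in (r, b):
        ch_eq = equalize_to(g, ch)
        dy, dx = shift_to_align(g, ch_eq)
        aligned.append(np.roll(ch_eq, (dy, dx), axis=(0, 1)))
    return np.dstack([aligned[0], g, aligned[1]])   # re-aligned R, G, B planes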
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Introduction to Bayesian Methods and Decision Theory.\n \n \n \n \n\n\n \n Wilson, S. P.; Dahyot, R.; and Cunningham, P.\n\n\n \n\n\n\n Machine Learning Techniques for Multimedia: Case Studies on Organization and Retrieval , pages 3–19. Springer Berlin Heidelberg (Eds: Cord, Matthieu and Cunningham, Pádraig), Berlin, Heidelberg, 2008.\n \n\n\n\n
\n\n\n\n \n \n \"MachinePaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@Inbook{Wilson2008, \nauthor =  {Wilson, Simon P. and Dahyot, Rozenn\nand Cunningham, P{\\'a}draig}, \nchapter =  {Introduction to Bayesian Methods and Decision Theory}, \ntitle =  {Machine Learning Techniques for Multimedia: Case Studies on Organization and Retrieval },\nyear =  {2008},\npublisher =  {Springer Berlin Heidelberg (Eds: Cord, Matthieu and Cunningham, P{\\'a}draig)}, \naddress =  {Berlin, Heidelberg}, \npages =  {3--19}, \nabstract =  {Bayesian methods are a class of statistical methods that have some appealing properties for solving problems in machine learning, \nparticularly when the process being modelled has uncertain or random aspects. In this chapter we look at the mathematical and philosophical basis\nfor Bayesian methods and how they relate to machine learning problems in multimedia. We also discuss the notion of decision theory, for making decisions \nunder uncertainty, that is closely related to Bayesian methods. The numerical methods needed to implement Bayesian solutions are also discussed.\nTwo specific applications of the Bayesian approach that are often used in machine learning -- na{\\"i}ve Bayes and Bayesian networks -- are then described\nin more detail.}, \nisbn =  {978-3-540-75171-7}, \ndoi =  {10.1007/978-3-540-75171-7{\\_1}}, \nurl =  {https://doi.org/10.1007/978-3-540-75171-7_1}}\n\n
\n
\n\n\n
\n Bayesian methods are a class of statistical methods that have some appealing properties for solving problems in machine learning, particularly when the process being modelled has uncertain or random aspects. In this chapter we look at the mathematical and philosophical basis for Bayesian methods and how they relate to machine learning problems in multimedia. We also discuss the notion of decision theory, for making decisions under uncertainty, that is closely related to Bayesian methods. The numerical methods needed to implement Bayesian solutions are also discussed. Two specific applications of the Bayesian approach that are often used in machine learning – naïve Bayes and Bayesian networks – are then described in more detail.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Action Recognition in Multimedia Streams.\n \n \n \n \n\n\n \n Dahyot, R.; Pitie, F.; Lennon, D.; Harte, N.; and Kokaram, A.\n\n\n \n\n\n\n Multimodal Processing and Interaction: Audio, Video, Text. Springer US (Eds: Maragos, Petros and Potamianos, Alexandros and Gros, Patrick), Boston, MA, 2008.\n \n\n\n\n
\n\n\n\n \n \n \"MultimodalPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@Inbook{DahyotChapter2008, \nauthor =  {Dahyot, Rozenn\nand Pitie, Fran{\\c{c}}ois\nand Lennon, Daire\nand Harte, Naomi\nand Kokaram, Anil},\nchapter =  {Action Recognition in Multimedia Streams},\ntitle =  {Multimodal Processing and Interaction: Audio, Video, Text}, \nyear =  {2008}, \npublisher =  {Springer US (Eds: Maragos, Petros and Potamianos, Alexandros and Gros, Patrick)},\naddress =  {Boston, MA}, \nisbn =  {978-0-387-76316-3},\ndoi =  {10.1007/978-0-387-76316-3{\\_}5}, \nurl =  {https://doi.org/10.1007/978-0-387-76316-3_5}}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Enhancement of Digital Photographs Using Color Transfer Techniques.\n \n \n \n \n\n\n \n Pitie, F.; Kokaram, A.; and Dahyot, R.\n\n\n \n\n\n\n Single-Sensor Imaging: Methods and Applications for Digital Cameras. CRC Press Image Processing Series, Rastislav Lukac (Ed.) ISBN: 9781420054521, October 2008.\n Github: https://github.com/frcs/colour-transfer \n\n\n\n
\n\n\n\n \n \n \"Single-SensorPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@Inbook{PitieCRC2008, \ntitle =  {Single-Sensor Imaging: Methods and Applications for Digital Cameras}, \nchapter =  {Enhancement of Digital Photographs Using Color Transfer Techniques},\nauthor =  {F. Pitie and A. Kokaram and R. Dahyot}, \npublisher =  {CRC Press Image Processing Series, Rastislav Lukac (Ed.) ISBN: 9781420054521}, \nmonth =  {October}, \nyear =  {2008}, \nnote = {Github: https://github.com/frcs/colour-transfer },\nurl = {https://github.com/frcs/colour-transfer/blob/master/publications/pitie08bookchapter.pdf},\ndoi =  {10.1201/9781420054538.ch11}}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2007\n \n \n (5)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Automated colour grading using colour distribution transfer.\n \n \n \n \n\n\n \n Pitie, F.; Kokaram, A. C.; and Dahyot, R.\n\n\n \n\n\n\n Computer Vision and Image Understanding, 107(1): 123 - 137. 2007.\n Github: https://github.com/frcs/colour-transfer \n\n\n\n
\n\n\n\n \n \n \"AutomatedPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@article{Pitie_CVIU2007, \ntitle =  {Automated colour grading using colour distribution transfer}, \njournal =  {Computer Vision and Image Understanding},\nvolume =  {107}, \nnumber =  {1}, \npages =  {123 - 137},\nyear =  {2007}, \nnote =  {Special issue on color image processing - Github: https://github.com/frcs/colour-transfer},\nissn =  {1077-3142}, \ndoi =  {10.1016/j.cviu.2006.11.011}, \nabstract = {This article proposes an original method for grading the colours between different images or shots.\nThe first stage of the method is to find a one-to-one colour mapping that transfers the palette of an example target picture to the original picture.\nThis is performed using an original and parameter free algorithm that is able to transform any N-dimensional probability density function into another one.\nThe proposed algorithm is iterative, non-linear and has a low computational cost. Applying the colour mapping on the original picture allows reproducing\nthe same ‘feel’ as the target picture, but can also increase the graininess of the original picture, especially if the colour dynamic of the two pictures\nis very different. The second stage of the method is to reduce\nthis grain artefact through an efficient post-processing algorithm that intends to preserve the gradient field of the original picture.},\nurl = {https://mural.maynoothuniversity.ie/15125/1/RD_automated.pdf},\neprint =  {http://www.sciencedirect.com/science/article/pii/S1077314206002189},\nauthor =  {François Pitie and Anil C. Kokaram and Rozenn Dahyot}, \nkeywords =  {Colour grading, Colour transfer, Re-colouring, Distribution transfer}}\n\n
\n
\n\n\n
\n This article proposes an original method for grading the colours between different images or shots. The first stage of the method is to find a one-to-one colour mapping that transfers the palette of an example target picture to the original picture. This is performed using an original and parameter free algorithm that is able to transform any N-dimensional probability density function into another one. The proposed algorithm is iterative, non-linear and has a low computational cost. Applying the colour mapping on the original picture allows reproducing the same ‘feel’ as the target picture, but can also increase the graininess of the original picture, especially if the colour dynamic of the two pictures is very different. The second stage of the method is to reduce this grain artefact through an efficient post-processing algorithm that intends to preserve the gradient field of the original picture.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Statistical Hough Transform.\n \n \n \n \n\n\n \n Dahyot, R.\n\n\n \n\n\n\n Technical Report TCD-CS-2007-37, School of Computer Science and Statistics, Trinity College Dublin Ireland, July 2007.\n \n\n\n\n
\n\n\n\n \n \n \"StatisticalPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@techreport{Dahyot07, \ntitle =  {Statistical Hough Transform}, \nauthor =  {R. Dahyot}, \nnumber =  {TCD-CS-2007-37}, \ninstitution =  {School of Computer Science and Statistics, Trinity College Dublin Ireland}, \nmonth =  {July}, \nyear =  {2007}, \nurl =  {https://www.cs.tcd.ie/publications/tech-reports/reports.07/TCD-CS-2007-37.pdf}}\n\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Optimal Mass Transport for Understanding and Synthesis of Visual Data.\n \n \n \n \n\n\n \n Dahyot, R.\n\n\n \n\n\n\n Technical Report School of Computer Science and Statistics, Trinity College Dublin Ireland, 2007.\n First stage proposal TCD selection to PIYRA (not funded)\n\n\n\n
\n\n\n\n \n \n \"OptimalPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@techreport{DahyotPIYRA07, \ntitle =  {Optimal Mass Transport for Understanding and Synthesis of Visual Data}, \nauthor =  {R. Dahyot}, \ninstitution =  {School of Computer Science and Statistics, Trinity College Dublin Ireland}, \nyear =  {2007}, \nnote = {First stage proposal TCD selection to PIYRA (not funded)},\nurl =  {https://roznn.github.io/PDF/RzDPIYRA2007.pdf}}\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Restoration of colour channel misalignments in colonoscopy videos.\n \n \n \n \n\n\n \n Dahyot, R.; and Lacey, G.\n\n\n \n\n\n\n Technical Report TCD-CS-2007-27, School of Computer Science and Statistics, Trinity College Dublin Ireland, July 2007.\n URI: http://hdl.handle.net/2262/90913\n\n\n\n
\n\n\n\n \n \n \"RestorationPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@techreport{DahyotLacey07, \ntitle =  {Restoration of colour channel misalignments in colonoscopy videos}, \nauthor =  {R. Dahyot and G. Lacey}, \nnumber =  {TCD-CS-2007-27}, \ninstitution =  {School of Computer Science and Statistics, \nTrinity College Dublin Ireland}, \nmonth =  {July},\nyear =  {2007}, \nabstract = {We propose a method to restore colonoscopy videos that have low quality RGB images. The main problem concerns a time delay occurring in between the recordings of the R, G and B colour channels. As the camera is moving along in the colon, sometimes quickly, the resulting images show non properly matched R, G and B causing blurry effects that impede the medical doctors or computer-aided analysis methods. We proposed to restore this artefact by first equalizing the colour channels and then performing a robust camera motion estimation and compensation. Experimental results show significant improvements from the original videos.},\nnote = {URI: http://hdl.handle.net/2262/90913},\nurl =  {http://www.tara.tcd.ie/bitstream/handle/2262/90913/TCD-CS-2007-27.pdf}}\n
\n
\n\n\n
\n We propose a method to restore colonoscopy videos that have low quality RGB images. The main problem concerns a time delay occurring in between the recordings of the R, G and B colour channels. As the camera is moving along in the colon, sometimes quickly, the resulting images show non properly matched R, G and B causing blurry effects that impede the medical doctors or computer-aided analysis methods. We proposed to restore this artefact by first equalizing the colour channels and then performing a robust camera motion estimation and compensation. Experimental results show significant improvements from the original videos.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Visual enhancement using multiple audio streams in live music performance.\n \n \n \n \n\n\n \n Dahyot, R.; Kelly, C.; and Kearney, G.\n\n\n \n\n\n\n In 31st International Conference Audio Engineering Society , London, UK, June 2007. \n \n\n\n\n
\n\n\n\n \n \n \"VisualPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{Kelly2007, \nauthor =  {R. Dahyot and C. Kelly and G. Kearney},\ntitle =  {Visual enhancement using multiple audio streams in live music performance}, \nabstract = {The use of multiple audio streams from digital mixing consoles is presented for application to real-time \nenhancement of synchronised visual effects in live music performances. The audio streams are processed simultaneously \nand their temporal and spectral characteristics can be used to control the intensity, duration and colour of the lights.\nThe efficiency of the approach is tested on rock and jazz pieces. \nThe result of the analysis is illustrated by a visual OpenGL 3-D animation illustrating the synchronous audio-visual events occurring in the musical piece.},\nbooktitle =  {31st International Conference Audio Engineering Society }, \neprint = {https://www.aes.org/e-lib/browse.cfm?elib=13947},\nurl = {https://www.aes.org/e-lib/browse.cfm?elib=13947},\nyear =  {2007}, \naddress =  {London, UK}, \nmonth =  {June}}\n\n
\n
\n\n\n
\n The use of multiple audio streams from digital mixing consoles is presented for application to real-time enhancement of synchronised visual effects in live music performances. The audio streams are processed simultaneously and their temporal and spectral characteristics can be used to control the intensity, duration and colour of the lights. The efficiency of the approach is tested on rock and jazz pieces. The result of the analysis is illustrated by a visual OpenGL 3-D animation illustrating the synchronous audio-visual events occurring in the musical piece.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2006\n \n \n (5)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n Bayesian Inferences for Object Detection.\n \n \n \n\n\n \n Dahyot, R.\n\n\n \n\n\n\n In 21st International Workshop on Statistical Modelling, pages 127-130, Galway, Ireland, July 3-7 2006. \n \n\n\n\n
\n\n\n\n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@INPROCEEDINGS{Dahyot_IWSM2006,\ntitle =  {Bayesian Inferences for Object Detection}, \nauthor =  {R. Dahyot},\nbooktitle =  {21st International Workshop on Statistical Modelling}, \naddress =  {Galway, Ireland}, \nmonth =  {July 3-7}, \npages =  {127-130}, \nyear =  {2006}}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Browsing sports video: trends in sports-related indexing and retrieval work.\n \n \n \n \n\n\n \n Kokaram, A.; Rea, N.; Dahyot, R.; Tekalp, M.; Bouthemy, P.; Gros, P.; and Sezan, I.\n\n\n \n\n\n\n IEEE Signal Processing Magazine, 23(2): 47-58. March 2006.\n URI: http://hdl.handle.net/2262/1998\n\n\n\n
\n\n\n\n \n \n \"BrowsingPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@ARTICLE{Kokaram2006, \nauthor =  {A. Kokaram and N. Rea and R. Dahyot and M. Tekalp and P. Bouthemy and P. Gros and I. Sezan}, \njournal =  {IEEE Signal Processing Magazine}, \ntitle =  {Browsing sports video: trends in sports-related indexing and retrieval work}, \nabstract = {This paper aims to identify the current trends in sports-based indexing and retrieval work. \nIt discusses the essential building blocks for any semantic-level retrieval system and acts as a case study in content\nanalysis system design. While one of the major benefits of digital media and digital television in particular has been to provide \nusers with more choices and a more interactive viewing experience, the freedom to choose has in fact manifested as the freedom to choose \nfrom the options the broadcaster provides. It is only through the use of automated content-based analysis that sports \nviewers will be given a chance to manipulate content at a much deeper level than that intended by broadcasters, \nand hence put true meaning into interactivity},\nyear =  {2006}, \nvolume =  {23}, \nnumber =  {2}, \npages =  {47-58}, \nkeywords =  {content-based retrieval;indexing;sport;video retrieval;automated content-based analysis;content analysis system design;digital media;digital television;interactive viewing experience;retrieval work;semantic-level retrieval system;sports video;sports-based indexing;sports-related indexing;Broadcasting;Cameras;Educational institutions;Games;Image analysis;Indexing;Information retrieval;Multimedia communication;Packaging;Tagging}, \nurl = {http://www.tara.tcd.ie/bitstream/handle/2262/1998/01621448.pdf},\nnote = {URI: http://hdl.handle.net/2262/1998},\ndoi =  {10.1109/MSP.2006.1621448}, \nISSN =  {1053-5888}, \nmonth =  {March}}\n\n
\n
\n\n\n
\n This paper aims to identify the current trends in sports-based indexing and retrieval work. It discusses the essential building blocks for any semantic-level retrieval system and acts as a case study in content analysis system design. While one of the major benefits of digital media and digital television in particular has been to provide users with more choices and a more interactive viewing experience, the freedom to choose has in fact manifested as the freedom to choose from the options the broadcaster provides. It is only through the use of automated content-based analysis that sports viewers will be given a chance to manipulate content at a much deeper level than that intended by broadcasters, and hence put true meaning into interactivity\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n Multimodal Periodicity Analysis for Illicit Content Detection in Videos.\n \n \n \n\n\n \n Rea, N.; Lambe, C.; Lacey, G.; and Dahyot, R.\n\n\n \n\n\n\n In The 3rd European Conference on Visual Media Production (CVMP 2006) - Part of the 2nd Multimedia Conference 2006, pages 106-114, Nov 2006. \n \n\n\n\n
\n\n\n\n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@inproceedings{Rea2006, \nauthor =  {N. Rea and C. Lambe and G. Lacey and R. Dahyot}, \nbooktitle =  {The 3rd European Conference on Visual Media Production (CVMP 2006) - Part of the 2nd Multimedia Conference 2006}, \ntitle =  {Multimodal Periodicity Analysis for Illicit Content Detection in Videos}, \nyear =  {2006}, \nvolume =  {}, \nnumber =  {}, \npages =  {106-114}, \nkeywords =  {}, \ndoi =  {10.1049/cp:20061978}, \neprint =  {https://ieeexplore.ieee.org/document/4156017/}, \nISSN =  {0537-9989}, \nmonth =  {Nov}}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Robust Scale Estimation for the generalized Gaussian Probability Density Function.\n \n \n \n \n\n\n \n Dahyot, R.; and Wilson, S.\n\n\n \n\n\n\n Advances in Methodology and Statistics (Metodološki zvezki), 3(1): 21-37. 2006.\n Also at http://mrvar.fdv.uni-lj.si/pub/mz/mz3.1/dahyot.pdf\n\n\n\n
\n\n\n\n \n \n \"RobustPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@ARTICLE{Dahyot_MZ2006, \ntitle =  {Robust Scale Estimation for the generalized Gaussian Probability Density Function},\nauthor =  {R. Dahyot and S. Wilson},\njournal =  {Advances in Methodology and Statistics (Metodolo\\v{s}ki zvezki)}, \nyear =  {2006},\npages =  {21-37}, \nnumber =  {1},\nabstract = {This article proposes a robust way to estimate the scale parameter of a generalised centered Gaussian mixture. The principle relies on the association of samples\nof this mixture to generate samples of a new variable that shows relevant distribution properties to estimate the unknown parameter. In fact, the distribution of this\nnew variable shows a maximum that is linked to this scale parameter. Using nonparametric modelling of the distribution and the MeanShift procedure, the relevant\npeak is identified and an estimate is computed. The whole procedure is fully automatic and does not require any prior settings. It is applied to regression problems,\nand digital data processing.},\nvolume =  {3}, \nnote = {Also at http://mrvar.fdv.uni-lj.si/pub/mz/mz3.1/dahyot.pdf},\nurl =  {http://www.tara.tcd.ie/bitstream/handle/2262/8718/dahyot.pdf}}\n\n
\n
\n\n\n
\n This article proposes a robust way to estimate the scale parameter of a generalised centered Gaussian mixture. The principle relies on the association of samples of this mixture to generate samples of a new variable that shows relevant distribution properties to estimate the unknown parameter. In fact, the distribution of this new variable shows a maximum that is linked to this scale parameter. Using nonparametric modelling of the distribution and the MeanShift procedure, the relevant peak is identified and an estimate is computed. The whole procedure is fully automatic and does not require any prior settings. It is applied to regression problems, and digital data processing.\n
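The mode-seeking ingredient mentioned in the abstract can be sketched as a 1D mean-shift procedure with a Gaussian kernel; the paper's specific derived variable (built from pairs of samples of the mixture) is not reconstructed here, so the function below is simply applied to whatever 1D sample that construction produces, and the bandwidth is left as a user choice rather than being estimated automatically as in the paper.

import numpy as np

def mean_shift_mode(samples, bandwidth, x0=None, tol=1e-6, max_iter=200):
    """Locate a mode of a 1D sample with the mean-shift procedure and a
    Gaussian kernel: move x to the kernel-weighted mean of the samples
    around it until it stops moving."""
    samples = np.asarray(samples, dtype=float)
    x = float(np.median(samples)) if x0 is None else float(x0)
    for _ in range(max_iter):
        w = np.exp(-0.5 * ((samples - x) / bandwidth) ** 2)
        x_new = np.sum(w * samples) / np.sum(w)
        if abs(x_new - x) < tol:
            break
        x = x_new
    return x

# In the paper the mode of a variable derived from pairs of observations is
# linked to the unknown scale; that derived sample is assumed to be `samples`.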
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Unsupervised Camera Motion Estimation and Moving Object Detection in Videos.\n \n \n \n \n\n\n \n Dahyot, R.\n\n\n \n\n\n\n In Irish Machine Vision and Image Processing conference (IMVIP 2006), Dublin, Ireland, 30 Aug.-1 Sept. 2006. \n \n\n\n\n
\n\n\n\n \n \n \"UnsupervisedPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@INPROCEEDINGS{Dahyot_IMVIP06, \ntitle =  {Unsupervised Camera Motion Estimation and Moving Object Detection in Videos}, \nauthor =  {R. Dahyot}, \nbooktitle =  {Irish Machine Vision and Image Processing conference (IMVIP 2006)}, \naddress =  {Dublin, Ireland},\nmonth =  {30 Aug.-1 Sept.}, \nyear =  {2006}, \nabstract = {In this article, we consider the robust estimation of a location parameter using M-estimators. We propose here to couple this estimation with the robust scale estimate proposed in [Dahyot and Wilson, 2006]. The resulting procedure is then completely unsupervised. It is applied to camera motion estimation and moving object detection in videos.\nExperimental results on different video materials show the adaptability and the accuracy\nof this new robust approach.},\nurl =  {http://www.tara.tcd.ie/bitstream/handle/2262/2058/RzDimvip06.pdf}}
\n
\n\n\n
\n In this article, we consider the robust estimation of a location parameter using M-estimators. We propose here to couple this estimation with the robust scale estimate proposed in [Dahyot and Wilson, 2006]. The resulting procedure is then completely unsupervised. It is applied to camera motion estimation and moving object detection in videos. Experimental results on different video materials show the adaptability and the accuracy of this new robust approach.\n
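A minimal sketch of the M-estimation step for a location parameter, using iteratively reweighted least squares with Huber weights; the coupling with the robust scale estimate of [Dahyot and Wilson, 2006] and the extension to parametric camera motion are not reproduced, and the tuning constant c = 1.345 is the usual textbook choice rather than a value from the paper.

import numpy as np

def m_estimate_location(x, scale, c=1.345, n_iter=50):
    """Robust location estimate via IRLS with Huber weights: observations whose
    standardised residual exceeds c are down-weighted instead of discarded."""
    x = np.asarray(x, dtype=float)
    mu = float(np.median(x))
    for _ in range(n_iter):
        r = (x - mu) / scale
        w = np.ones_like(r)
        far = np.abs(r) > c
        w[far] = c / np.abs(r[far])        # Huber weight psi(r)/r
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < 1e-9:
            break
        mu = mu_new
    return mu

# `scale` would come from a robust scale estimator (see the 2006 article above);
# the same weighting idea extends to the parameters of a global motion model.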
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2005\n \n \n (6)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Proceedings of the Workshop on Machine Learning Techniques for Processing Multimedia Content.\n \n \n \n \n\n\n \n Cord, M.; Cunningham, P.; Dahyot, R.; and Sziranyi, T.,\n editors.\n \n\n\n \n\n\n\n Bonn, Germany, August 2005.\n URI: http://hdl.handle.net/2262/52985\n\n\n\n
\n\n\n\n \n \n \"ProceedingsPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@proceedings{Cord2005,\neditor  = {Matthieu Cord and Padraig Cunningham and Rozenn Dahyot and Tamas Sziranyi},\ntitle = {Proceedings of the Workshop on Machine Learning Techniques for Processing Multimedia Content},\nurl = {http://www.tara.tcd.ie/bitstream/handle/2262/52985/Workshop2005.pdf},\nabstract = {Machine Learning (ML) techniques are used in situations where data is available in electronic format and ML algorithms can add value by analysing this data. This is the situation with the processing of multimedia content. The added value from ML can take a number of forms:  \nby providing insight into the domain from which the data is drawn,  by improving the performance of another process that is manipulating the data, \nby organising the data in some way or  by helping to interpret multimedia content to make it more understandable. \nThis potential for ML to add value in processing of multimedia content has made this one of the most popular application areas for ML research. Multimedia content has some characteristics that place specific demands on ML. The data is typically of very high dimension and dimension reduction is often required. The normal distinction between supervised and unsupervised techniques doesn't always apply; it is often the case that only some of the data is labeled or the user may assist in labeling the data during processing. Typically the ML process is preceded by a feature extraction stage and the success of the ML stage will often depend on the feature extraction. This workshop on Machine Learning Techniques for Processing Multimedia Content has been organized because of these special issues that arise with multimedia data. We have papers describing applications in image processing, video analysis and music classification. The research described in these papers has drawn on a wide range of ML techniques. It is hoped that this workshop will help identify important research directions for Machine Learning that will help in the processing of multimedia content.},\nnote = {URI: http://hdl.handle.net/2262/52985},\naddress = {Bonn, Germany},\nmonth = {August},\nyear = {2005}}\n
\n
\n\n\n
\n Machine Learning (ML) techniques are used in situations where data is available in electronic format and ML algorithms can add value by analysing this data. This is the situation with the processing of multimedia content. The added value from ML can take a number of forms: by providing insight into the domain from which the data is drawn, by improving the performance of another process that is manipulating the data, by organising the data in some way or by helping to interpret multimedia content to make it more understandable. This potential for ML to add value in processing of multimedia content has made this one of the most popular application areas for ML research. Multimedia content has some characteristics that place specific demands on ML. The data is typically of very high dimension and dimension reduction is often required. The normal distinction between supervised and unsupervised techniques doesn't always apply; it is often the case that only some of the data is labeled or the user may assist in labeling the data during processing. Typically the ML process is preceded by a feature extraction stage and the success of the ML stage will often depend on the feature extraction. This workshop on Machine Learning Techniques for Processing Multimedia Content has been organized because of these special issues that arise with multimedia data. We have papers describing applications in image processing, video analysis and music classification. The research described in these papers has drawn on a wide range of ML techniques. It is hoped that this workshop will help identify important research directions for Machine Learning that will help in the processing of multimedia content.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Classification and representation of semantic content in broadcast tennis videos.\n \n \n \n \n\n\n \n Rea, N.; Dahyot, R.; and Kokaram, A.\n\n\n \n\n\n\n In IEEE International Conference on Image Processing 2005, volume 3, pages III-1204-7, Sept 2005. \n URI: http://hdl.handle.net/2262/19779\n\n\n\n
\n\n\n\n \n \n \"ClassificationPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 2 downloads\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@INPROCEEDINGS{ReaICIP05, \nauthor =  {N. Rea and R. Dahyot and A. Kokaram}, \nbooktitle =  {IEEE International Conference on Image Processing 2005}, \ntitle =  {Classification and representation of semantic content in broadcast tennis videos}, \nyear =  {2005},\nvolume =  {3}, \nnumber =  {}, \npages =  {III-1204-7}, \nkeywords =  {image classification;image colour analysis;image representation;image sequences;particle filtering (numerical methods);video signal processing;broadcast tennis videos;particle filter;semantic analysis;spatio-temporal behaviour;video classification;video representation;video sequence;Content based retrieval;Educational institutions;Hidden Markov models;Histograms;Multimedia communication;Particle filters;Particle tracking;Streaming media;TV broadcasting;Videos}, \nurl = {http://www.tara.tcd.ie/bitstream/handle/2262/19779/01530614.pdf},\nnote = {URI: http://hdl.handle.net/2262/19779},\ndoi = {10.1109/ICIP.2005.1530614}, \nabstract = {This paper investigates the semantic analysis of broadcast tennis footage. We consider the spatio-temporal behaviour of an object in the footage as being the embodiment of a semantic event. This object is tracked using a colour based particle filter. The video syntax and audio features are used to help delineate the temporal boundaries of these events. For broadcast tennis footage, the system firstly parses the video sequence based on the geometry of the content in view and classifies the clip as a particular view type. The temporal behaviour of the serving player is modelled using a HMM. As a result, each model is representative of a particular semantic episode. Events are then summarised using a number of synthesised keyframes.},\nISSN =  {1522-4880},\nmonth =  {Sept}}\n\n
\n
\n\n\n
\n This paper investigates the semantic analysis of broadcast tennis footage. We consider the spatio-temporal behaviour of an object in the footage as being the embodiment of a semantic event. This object is tracked using a colour based particle filter. The video syntax and audio features are used to help delineate the temporal boundaries of these events. For broadcast tennis footage, the system firstly parses the video sequence based on the geometry of the content in view and classifies the clip as a particular view type. The temporal behaviour of the serving player is modelled using a HMM. As a result, each model is representative of a particular semantic episode. Events are then summarised using a number of synthesised keyframes.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Content Controlled Image Representation for Sports Streaming.\n \n \n \n \n\n\n \n Kokaram, A.; Pitie, F.; Dahyot, R.; Rea, N.; and Yeterian, S.\n\n\n \n\n\n\n In proceedings of the IEEE workshop on Content Based Multimedia Indexing (CBMI'05), Riga, Latvia, June 2005. \n URI: http://hdl.handle.net/2262/24739\n\n\n\n
\n\n\n\n \n \n \"ContentPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@INPROCEEDINGS{KokaramCBMI05, \ntitle =  {Content Controlled Image Representation for Sports Streaming},\nabstract = {Content based analysis has traditionally been posed in the\ncontext of identifying some material in response to a user\nquery. This paper illustrates that given a content based analysis process that can identify semantic events in a sequence,\nthat sequence can then be changed in various ways. A Motion Keyframe is presented to re-express the viewing of a sequence. The notion of content analysis for control of other\nmedia processing engines is introduced. Tennis footage is\nused to illustrate the ideas since sports in general contains\nstrong contextual information.},\nauthor =  {A. Kokaram and F. Pitie and R. Dahyot and N. Rea and S. Yeterian}, \nurl = {http://www.tara.tcd.ie/bitstream/handle/2262/24739/cbmi05.pdf},\nnote = {URI: http://hdl.handle.net/2262/24739},\nbooktitle =  {proceedings of the IEEE workshop on Content Based Multimedia Indexing (CBMI'05)}, \naddress =  {Riga, Latvia},\nmonth =  {June}, \nyear =  {2005}}\n\n
\n
\n\n\n
\n Content based analysis has traditionally been posed in the context of identifying some material in response to a user query. This paper illustrates that given a content based analysis process that can identify semantic events in a sequence, that sequence can then be changed in various ways. A Motion Keyframe is presented to re-express the viewing of a sequence. The notion of content analysis for control of other media processing engines is introduced. Tennis footage is used to illustrate the ideas since sports in general contains strong contextual information.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Exploiting Temporal Discontinuities for Event Detection and Manipulation in Video Streams.\n \n \n \n \n\n\n \n Denman, H.; Doyle, E.; Kokaram, A.; Lennon, D.; Dahyot, R.; and Fuller, R.\n\n\n \n\n\n\n In Proceedings of the 7th ACM SIGMM International Workshop on Multimedia Information Retrieval, of MIR 05, pages 183-192, New York, NY, USA, 2005. Association for Computing Machinery\n \n\n\n\n
\n\n\n\n \n \n \"ExploitingPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n \n \n \n \n\n\n\n
\n
@inproceedings{10.1145/1101826.1101857, \nauthor =  {Denman, Hugh and Doyle, Erika and Kokaram, Anil and Lennon, Daire and Dahyot, Rozenn and Fuller, Ray}, \ntitle =  {Exploiting Temporal Discontinuities for Event Detection and Manipulation in Video Streams}, \nyear =  {2005},\nisbn =  {1595932445},\nabstract = {Discontinuities in any information bearing signal serve to represent much of the vital or interesting content in that signal. A sharp loud noise in a movie could be a gun, or something breaking. In sports like tennis, cricket or snooker/pool it would indicate a point scoring event. In both cases the discontinuity is likely to be semantically relevant without further inference being necessary, once a particular domain is adopted. This paper discusses the importance of temporal motion discontinuities in inferring events in visual media. Two particular application domains are considered: content based audio/video synchronisation and event spotting in observational Psychology.},\npublisher =  {Association for Computing Machinery}, \naddress =  {New York, NY, USA}, \nurl =  {https://mural.maynoothuniversity.ie/15289/1/RD_exploiting.pdf}, \ndoi =  {10.1145/1101826.1101857}, \nbooktitle =  {Proceedings of the 7th ACM SIGMM International Workshop on Multimedia Information Retrieval}, \npages =  {183-192}, \nnumpages =  {10}, \nkeywords =  {event spotting, video retrieval, motion tracking, information retrieval, bayesian inference}, location =  {Hilton, Singapore}, series =  {MIR 05}}\n\n
\n
\n\n\n
\n Discontinuities in any information bearing signal serve to represent much of the vital or interesting content in that signal. A sharp loud noise in a movie could be a gun, or something breaking. In sports like tennis, cricket or snooker/pool it would indicate a point scoring event. In both cases the discontinuity is likely to be semantically relevant without further inference being necessary, once a particular domain is adopted. This paper discusses the importance of temporal motion discontinuities in inferring events in visual media. Two particular application domains are considered: content based audio/video synchronisation and event spotting in observational Psychology.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n N-dimensional probability density function transfer and its application to color transfer.\n \n \n \n \n\n\n \n Pitie, F.; Kokaram, A. C.; and Dahyot, R.\n\n\n \n\n\n\n In Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1, volume 2, pages 1434-1439, Oct 2005. \n URI: http://hdl.handle.net/2262/19800 - Github: https://github.com/frcs/colour-transfer\n\n\n\n
\n\n\n\n \n \n \"N-dimensionalPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@Inproceedings{PitieICCV2005,\nauthor =  {F. Pitie and A. C. Kokaram and R. Dahyot}, \nbooktitle =  {Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1}, \ntitle =  {N-dimensional probability density function transfer and its application to color transfer}, \nyear =  {2005}, \nvolume =  {2}, \nnumber =  {},\npages =  {1434-1439},\nkeywords =  {image colour analysis;probability;1D marginal distribution;automated color grading;color transfer;continuous transformation;probability density function;Color;Computational efficiency;Density functional theory;Distributed computing;Educational institutions;Image converters;Iterative methods;Rendering (computer graphics);Statistical distributions;Statistics},\nurl = {http://www.tara.tcd.ie/bitstream/handle/2262/19800/01544887.pdf},\nnote = {URI: http://hdl.handle.net/2262/19800 -  Github: https://github.com/frcs/colour-transfer},\nabstract = {This article proposes an original method to estimate a\ncontinuous transformation that maps a N-dimensional distribution\nto another. The method is iterative, non-linear, and\nis shown to converge. Only 1D marginal distributions are\nused in the estimation process, hence involving low computation\ncosts. As an illustration this mapping is applied\nto colour transfer between two images of different contents.\nThe paper also serves as a central focal point for collecting\ntogether the research activity in this area and relating it to\nthe important problem of Automated Colour Grading.},\ndoi =  {10.1109/ICCV.2005.166}, \nISSN =  {1550-5499},\nmonth =  {Oct}}\n\n
\n
\n\n\n
\n This article proposes an original method to estimate a continuous transformation that maps a N-dimensional distribution to another. The method is iterative, non-linear, and is shown to converge. Only 1D marginal distributions are used in the estimation process, hence involving low computation costs. As an illustration this mapping is applied to colour transfer between two images of different contents. The paper also serves as a central focal point for collecting together the research activity in this area and relating it to the important problem of Automated Colour Grading.\n
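A compact sketch of the 1D-marginal iteration described above, assuming the two distributions are given as point sets such as RGB pixels: at each iteration both sets are rotated by a random orthonormal basis, each 1D marginal of the source is mapped onto the corresponding target marginal by quantile matching, and the result is rotated back. The convergence analysis and the grain-reducing post-processing of the related journal paper are not included, and the number of iterations is an arbitrary choice here.

import numpy as np

def pdf_transfer(source, target, n_iter=30, seed=0):
    """Move `source` samples so that their distribution approaches `target`
    by repeatedly matching 1D marginals along random orthonormal directions.
    source: (N, d) array, target: (M, d) array (e.g. RGB pixels, d = 3)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(source, dtype=float)
    t = np.asarray(target, dtype=float)
    d = x.shape[1]
    for _ in range(n_iter):
        q, _ = np.linalg.qr(rng.normal(size=(d, d)))   # random orthonormal basis
        xp, tp = x @ q, t @ q                          # project both point sets
        for j in range(d):                             # match each 1D marginal
            order = np.argsort(xp[:, j])
            quantiles = np.sort(tp[:, j])
            idx = np.linspace(0, len(quantiles) - 1, len(order)).astype(int)
            xp[order, j] = quantiles[idx]              # rank-to-quantile mapping
        x = xp @ q.T                                   # rotate back
    return x

# For colour transfer, `source` and `target` are the pixels of the two images,
# and the returned array is reshaped back into the original image grid.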
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Off-line multiple object tracking using candidate selection and the Viterbi algorithm.\n \n \n \n \n\n\n \n Pitie, F.; Berrani, S. A.; Kokaram, A.; and Dahyot, R.\n\n\n \n\n\n\n In IEEE International Conference on Image Processing 2005, volume 3, pages III-109-12, Sept 2005. \n URI: http://hdl.handle.net/2262/19821\n\n\n\n
\n\n\n\n \n \n \"Off-linePaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@INPROCEEDINGS{PitieICIP05, \nauthor =  {F. Pitie  and S. A. Berrani and A. Kokaram and R. Dahyot}, \nbooktitle =  {IEEE International Conference on Image Processing 2005}, \ntitle =  {Off-line multiple object tracking using candidate selection and the Viterbi algorithm}, \nyear =  {2005}, \nvolume =  {3}, \nnumber =  {}, \npages =  {III-109-12}, \nkeywords =  {maximum likelihood estimation;object detection;particle filtering (numerical methods);Viterbi algorithm;candidate selection;deterministic solution;off-line multiple object tracking;particle filter methods;probabilistic framework;Data mining;Feature extraction;Image sequences;Indexing;Information retrieval;Particle filters;Particle tracking;Performance analysis;Surveillance;Viterbi algorithm},\nurl = {http://www.tara.tcd.ie/bitstream/handle/2262/19821/01530340.pdf},\nnote = {URI: http://hdl.handle.net/2262/19821},\nabstract = {This paper presents a probabilistic framework for off-line\nmultiple object tracking. At each timestep, a small set of\ndeterministic candidates is generated which is guaranteed\nto contain the correct solution. Tracking an object within\nvideo then becomes possible using the Viterbi algorithm. In\ncontrast with particle filter methods where candidates are\nnumerous and random, the proposed algorithm involves a\nfew candidates and results in a deterministic solution. Moreover, we consider here off-line applications where past and\nfuture information is exploited. This paper shows that, although basic and very simple, this candidate selection allows the solution of many tracking problems in different\nreal-world applications and offers a good alternative to particle filter methods for off-line applications.},\ndoi =  {10.1109/ICIP.2005.1530340}, \nISSN =  {1522-4880}, \nmonth =  {Sept}}\n
\n
\n\n\n
\n This paper presents a probabilistic framework for off-line multiple object tracking. At each timestep, a small set of deterministic candidates is generated which is guaranteed to contain the correct solution. Tracking an object within video then becomes possible using the Viterbi algorithm. In contrast with particle filter methods where candidates are numerous and random, the proposed algorithm involves a few candidates and results in a deterministic solution. Moreover, we consider here off-line applications where past and future information is exploited. This paper shows that, although basic and very simple, this candidate selection allows the solution of many tracking problems in different real-world applications and offers a good alternative to particle filter methods for off-line applications.\n
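A sketch of the dynamic-programming backbone, assuming the candidate-selection step has already produced a short list of candidate positions and an appearance cost per frame: the Viterbi recursion then returns the globally cheapest path, with a simple Euclidean distance standing in for the transition cost. Names and the cost weighting are illustrative, not the paper's exact formulation.

import numpy as np

def viterbi_track(candidates, unary, motion_weight=1.0):
    """candidates: list over frames, each an (n_t, 2) array of candidate (x, y).
    unary: list of (n_t,) arrays with the cost of each candidate
    (e.g. an appearance score). Returns the index of the selected candidate
    in every frame, i.e. the globally cheapest path through the candidates."""
    T = len(candidates)
    cost = [np.asarray(unary[0], dtype=float)]
    back = []
    for t in range(1, T):
        cur = np.asarray(candidates[t], dtype=float)
        prev = np.asarray(candidates[t - 1], dtype=float)
        trans = np.linalg.norm(cur[:, None, :] - prev[None, :, :], axis=2)
        total = (np.asarray(unary[t], dtype=float)[:, None]
                 + motion_weight * trans + cost[-1][None, :])
        back.append(np.argmin(total, axis=1))   # best predecessor of each candidate
        cost.append(np.min(total, axis=1))
    path = [int(np.argmin(cost[-1]))]
    for t in range(T - 2, -1, -1):              # backtrack from the last frame
        path.append(int(back[t][path[-1]]))
    return path[::-1]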
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2004\n \n \n (7)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n A Bayesian approach to object detection using probabilistic appearance-based models.\n \n \n \n \n\n\n \n Dahyot, R.; Charbonnier, P.; and Heitz, F.\n\n\n \n\n\n\n Pattern Analysis and Applications, 7(3): 317–332. Dec 2004.\n \n\n\n\n
\n\n\n\n \n \n \"APaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 1 download\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@ARTICLE{Dahyot_PAA, \nauthor =  {Dahyot, Rozenn\nand Charbonnier, Pierre\nand Heitz, Fabrice}, \ntitle =  {A Bayesian approach to object detection using probabilistic appearance-based models},\njournal =  {Pattern Analysis and Applications},\nyear =  {2004}, \nmonth =  {Dec}, \nday =  {01}, \nvolume =  {7}, \nnumber =  {3}, \npages =  {317--332}, \nabstract =  {In this paper, we introduce a Bayesian approach, inspired by \nprobabilistic principal component analysis (PPCA) (Tipping and Bishop in J Royal Stat Soc Ser B 61(3):611--622, 1999),\n to detect objects in complex scenes using appearance-based models. The originality of the proposed framework is to explicitly \n take into account general forms of the underlying distributions, both for the in-eigenspace distribution and for the observation model. \n The approach combines linear data reduction techniques (to preserve computational efficiency), non-linear constraints on the in-eigenspace \n distribution (to model complex variabilities) and non-linear (robust) observation models (to cope with clutter, outliers and occlusions). \n The resulting statistical representation generalises most existing PCA-based models\n  (Tipping and Bishop in J Royal Stat Soc Ser B 61(3):611--622, 1999; Black and Jepson in Int J Comput Vis 26(1):63--84, 1998; Moghaddam and Pentland in IEEE Trans Pattern Anal Machine Intell 19(7):696--710, 1997) and leads to the definition of a new family of non-linear probabilistic detectors. The performance of the approach is assessed using receiver operating characteristic (ROC) analysis on several representative databases, showing a major improvement in detection performances with respect to the standard methods that have been the references up to now.}, \nissn =  {1433-755X},\ndoi =  {10.1007/s10044-004-0230-5}, \nurl =  {https://mural.maynoothuniversity.ie/15128/1/RD_a%20bayesian.pdf\n}}\n\n
\n
\n\n\n
\n In this paper, we introduce a Bayesian approach, inspired by probabilistic principal component analysis (PPCA) (Tipping and Bishop in J Royal Stat Soc Ser B 61(3):611–622, 1999), to detect objects in complex scenes using appearance-based models. The originality of the proposed framework is to explicitly take into account general forms of the underlying distributions, both for the in-eigenspace distribution and for the observation model. The approach combines linear data reduction techniques (to preserve computational efficiency), non-linear constraints on the in-eigenspace distribution (to model complex variabilities) and non-linear (robust) observation models (to cope with clutter, outliers and occlusions). The resulting statistical representation generalises most existing PCA-based models (Tipping and Bishop in J Royal Stat Soc Ser B 61(3):611–622, 1999; Black and Jepson in Int J Comput Vis 26(1):63–84, 1998; Moghaddam and Pentland in IEEE Trans Pattern Anal Machine Intell 19(7):696–710, 1997) and leads to the definition of a new family of non-linear probabilistic detectors. The performance of the approach is assessed using receiver operating characteristic (ROC) analysis on several representative databases, showing a major improvement in detection performances with respect to the standard methods that have been the references up to now.\n
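A schematic of the eigenspace ingredients mentioned above, assuming training windows are already vectorised: windows are projected on a PCA basis and scored by combining an in-eigenspace cost with a robust (bounded) cost on the out-of-space residual, so that a few occluded or cluttered pixels cannot dominate the decision. The truncated quadratic below is a generic robust penalty standing in for the paper's non-linear distributions; the full Bayesian decision rule is not reproduced.

import numpy as np

def fit_eigenspace(train, k):
    """PCA basis from vectorised training windows (rows = samples)."""
    mean = train.mean(axis=0)
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:k]                       # mean and the k principal axes

def detection_score(x, mean, basis, in_space_var, robust_c=3.0):
    """Score one vectorised window: a Gaussian-like cost inside the eigenspace
    plus a robust (bounded) cost on the out-of-space residual."""
    c = basis @ (x - mean)                    # in-eigenspace coordinates
    residual = x - (mean + basis.T @ c)       # "distance from eigenspace" image
    in_cost = 0.5 * np.sum(c ** 2 / in_space_var)
    out_cost = np.sum(np.minimum(residual ** 2, robust_c ** 2))  # truncated quadratic
    return -(in_cost + out_cost)              # higher score = more object-like

# A window is declared "object" when its score exceeds a threshold chosen,
# for instance, from an ROC curve on validation data.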
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n A New Robust Technique for Stabilizing Brightness Fluctuations in Image Sequences.\n \n \n \n \n\n\n \n Pitie, F.; Dahyot, R.; Kelly, F.; and Kokaram, A.\n\n\n \n\n\n\n In Comaniciu, D.; Mester, R.; Kanatani, K.; and Suter, D., editor(s), Statistical Methods in Video Processing, pages 153–164, Berlin, Heidelberg, 2004. Springer Berlin Heidelberg\n \n\n\n\n
\n\n\n\n \n \n \"APaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@INPROCEEDINGS{Pitiesmvp2004,\nauthor =  {Pitie, Fran{\\c{c}}ois and Dahyot, Rozenn and Kelly, Francis and Kokaram, Anil}, \neditor =  {Comaniciu, Dorin\nand Mester, Rudolf\nand Kanatani, Kenichi\nand Suter, David}, \ntitle =  {A New Robust Technique for Stabilizing Brightness Fluctuations in Image Sequences}, \nbooktitle =  {Statistical Methods in Video Processing},\nyear =  {2004}, \npublisher =  {Springer Berlin Heidelberg}, \naddress =  {Berlin, Heidelberg},\npages =  {153--164},\nabstract =  {Temporal random variation of luminance in images can manifest in film and video due to a wide variety of sources. Typical in archived films, it also affects scenes recorded simultaneously with different cameras (e.g. for film special effect), and scenes affected by illumination problems. Many applications in Computer Vision and Image Processing that try to match images (e.g. for motion estimation, stereo vision, etc.) have to cope with this problem. The success of current techniques for dealing with this is limited by the non-linearity of severe distortion, the presence of motion and missing data (yielding outliers in the estimation process) and the lack of fast implementations in reconfigurable systems. This paper proposes a new process for stabilizing brightness fluctuations that improves the existing models. The article also introduces a new estimation method able to cope with outliers in the joint distribution of pairs images. The system implementation is based on the novel use of general purpose PC graphics hardware. The overall system presented here is able to deal with much more severe distortion than previously was the case, and in addition can operate at 7 fps on a 1.6GHz PC with broadcast standard definition images.},\nisbn =  {978-3-540-30212-4},\nurl={https://link.springer.com/content/pdf/10.1007/978-3-540-30212-4_14.pdf}, \ndoi =  {10.1007/978-3-540-30212-4{\\_}14}}\n\n
\n
\n\n\n
\n Temporal random variation of luminance in images can manifest in film and video due to a wide variety of sources. Typical in archived films, it also affects scenes recorded simultaneously with different cameras (e.g. for film special effects), and scenes affected by illumination problems. Many applications in Computer Vision and Image Processing that try to match images (e.g. for motion estimation, stereo vision, etc.) have to cope with this problem. The success of current techniques for dealing with this is limited by the non-linearity of severe distortion, the presence of motion and missing data (yielding outliers in the estimation process) and the lack of fast implementations in reconfigurable systems. This paper proposes a new process for stabilizing brightness fluctuations that improves the existing models. The article also introduces a new estimation method able to cope with outliers in the joint distribution of pairs of images. The system implementation is based on the novel use of general-purpose PC graphics hardware. The overall system presented here is able to deal with much more severe distortion than previously was the case, and in addition can operate at 7 fps on a 1.6GHz PC with broadcast standard definition images.\n
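A minimal sketch of the kind of robust brightness stabilization the abstract describes, assuming a simple linear gain/offset model fitted by iteratively reweighted least squares against a reference frame; the paper's method handles more severe non-linear distortion, motion and a GPU implementation, none of which is reproduced here, and all constants below are illustrative.

```python
import numpy as np

def robust_gain_offset(frame, reference, n_iter=10, c=4.0):
    """Estimate a, b so that a*frame + b matches `reference`, down-weighting
    pixels that disagree (motion, blotches) with a Huber-style weight."""
    x = frame.ravel().astype(float)
    y = reference.ravel().astype(float)
    w = np.ones_like(x)
    a, b = 1.0, 0.0
    for _ in range(n_iter):
        sw = np.sqrt(w)
        A = np.stack([x * sw, sw], axis=1)               # weighted design matrix
        a, b = np.linalg.lstsq(A, y * sw, rcond=None)[0]
        r = y - (a * x + b)
        scale = 1.4826 * np.median(np.abs(r)) + 1e-9     # robust scale (MAD)
        w = np.where(np.abs(r) < c * scale, 1.0, c * scale / np.abs(r))
    return a, b

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    ref = rng.uniform(0.0, 255.0, size=(64, 64))
    flickered = 0.7 * ref + 20.0 + rng.normal(0.0, 1.0, size=ref.shape)
    a, b = robust_gain_offset(flickered, ref)
    print(round(a, 2), round(b, 2))        # roughly 1.43 and -28.6
    corrected = a * flickered + b          # brightness-stabilised frame
```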
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Comparison of Two Algorithms for Robust M-estimation of Global Motion Parameters .\n \n \n \n \n\n\n \n Dahyot, R.; and Kokaram, A.\n\n\n \n\n\n\n In Irish Machine Vision and Image Processing conference (IMVIP 2004), pages 224-231, Dublin, Ireland, September 2004. \n \n\n\n\n
\n\n\n\n \n \n \"ComparisonPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n \n \n \n \n\n\n\n
\n
@INPROCEEDINGS{Dahyot_IMVIP04,\nauthor =  {R. Dahyot and A. Kokaram}, \ntitle =  {Comparison of Two Algorithms for Robust M-estimation of Global Motion Parameters },\nbooktitle =  {Irish Machine Vision and Image Processing conference (IMVIP 2004)}, \nmonth =  {September}, \nabstract = {The estimation of Global or Camera motion from image sequences is important both for video\nretrieval and compression (MPEG4). This is frequently performed using robust M-estimators with\nthe widely used Iterative Reweighted Least Squares algorithm. This article presents an investigation\nof the use of an alternative robust estimation algorithm and illustrates its improved computationnal\nefficiency. The paper also introduces two new confidence measures which can be used to validate\ncamera motion measurements in the context of information retrieval.},\nkeywords = {Camera motion, M-estimators, Video analysis},\nyear =  {2004}, \nurl = {https://mural.maynoothuniversity.ie/15315/1/RD_comparison.pdf},\npages =  {224-231},\naddress =  {Dublin, Ireland}}\n\n
\n
\n\n\n
\n The estimation of Global or Camera motion from image sequences is important both for video retrieval and compression (MPEG4). This is frequently performed using robust M-estimators with the widely used Iterative Reweighted Least Squares algorithm. This article presents an investigation of the use of an alternative robust estimation algorithm and illustrates its improved computational efficiency. The paper also introduces two new confidence measures which can be used to validate camera motion measurements in the context of information retrieval.\n
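The abstract refers to robust M-estimation of global motion with Iterative Reweighted Least Squares. The sketch below shows plain IRLS for a purely translational model linearised with image gradients (the baseline scheme, not the paper's alternative algorithm); the synthetic frames and constants are assumptions made for illustration.

```python
import numpy as np

def global_translation_irls(img1, img2, n_iter=10, c=3.0):
    """Robust M-estimation (IRLS) of a global translation (dx, dy) between two
    frames, using the linearised brightness-constancy constraint
    It + Ix*dx + Iy*dy ~ 0 and Huber-style reweighting of outlier pixels."""
    ix = np.gradient(img1, axis=1).ravel()
    iy = np.gradient(img1, axis=0).ravel()
    it = (img2 - img1).ravel()
    A = np.stack([ix, iy], axis=1)
    w = np.ones(len(it))
    d = np.zeros(2)
    for _ in range(n_iter):
        sw = np.sqrt(w)[:, None]
        d = np.linalg.lstsq(A * sw, -it * sw.ravel(), rcond=None)[0]
        r = it + A @ d
        scale = 1.4826 * np.median(np.abs(r)) + 1e-9     # robust scale (MAD)
        w = np.where(np.abs(r) < c * scale, 1.0, c * scale / np.abs(r))
    return d   # (dx, dy), valid for small displacements only

if __name__ == "__main__":
    yy, xx = np.mgrid[0:64, 0:64].astype(float)
    img1 = np.sin(xx / 6.0) + np.cos(yy / 7.0)       # smooth synthetic frame
    img2 = np.roll(img1, shift=1, axis=1)            # global 1-pixel shift in x
    img2[20:30, 20:30] += 2.0                        # a locally moving object (outliers)
    print(global_translation_irls(img1, img2).round(2))   # close to [1. 0.]
```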
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Inlier modeling for multimedia data analysis.\n \n \n \n \n\n\n \n Dahyot, R.; Rea, N.; Kokaram, A.; and Kingsbury, N.\n\n\n \n\n\n\n In IEEE 6th Workshop on Multimedia Signal Processing, 2004., pages 482-485, Sept 2004. \n URI: http://hdl.handle.net/2262/19839 \n\n\n\n
\n\n\n\n \n \n \"InlierPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@INPROCEEDINGS{DahyotMMSP04, \nauthor =  {R. Dahyot and N. Rea and A. Kokaram and N. Kingsbury}, \nbooktitle =  {IEEE 6th Workshop on Multimedia Signal Processing, 2004.},\ntitle =  {Inlier modeling for multimedia data analysis}, \nyear =  {2004},\nvolume =  {}, \nnumber =  {}, \npages =  {482-485}, \nkeywords =  {audio signal processing;multimedia communication;normal distribution;audio data segmentation;centred normal distribution;colour class parameter extraction;multimedia data analysis;signal processing;Data analysis;Distributed computing;Educational institutions;Gaussian distribution;Parameter estimation;Parameter extraction;Random variables;Robustness;Signal processing;Statistical distributions}, \nabstract = {This paper presents a robust method to estimate the unknown standard deviation of a centred normal distribution from a mixture density. This method is applied to different signal processing problems. The first one concerns silence segmentation from audio data. The second application deals with colour class parameter extraction. In this later case, the mean is also estimated from the observations.},\ndoi =  {10.1109/MMSP.2004.1436600}, \nurl = {http://www.tara.tcd.ie/bitstream/handle/2262/19839/01436600.pdf},\nnote = {URI: http://hdl.handle.net/2262/19839 },\nISSN =  {},\nmonth =  {Sept}}\n\n
\n
\n\n\n
\n This paper presents a robust method to estimate the unknown standard deviation of a centred normal distribution from a mixture density. This method is applied to different signal processing problems. The first one concerns silence segmentation from audio data. The second application deals with colour class parameter extraction. In this latter case, the mean is also estimated from the observations.\n
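As one concrete (and simplified) reading of the problem stated in the abstract, the following sketch estimates the standard deviation of zero-mean Gaussian inliers from a contaminated sample by iterative truncation with a bias correction. It is an illustrative stand-in, not the estimator derived in the paper, and the synthetic "silence plus speech" data are invented.

```python
import math
import numpy as np

def inlier_sigma(x, k=2.5, n_iter=20):
    """Estimate the std of zero-mean Gaussian inliers in a contaminated sample:
    keep observations within k*sigma, then correct the truncated-sample std by
    the std of a standard normal truncated to [-k, k]."""
    phi = math.exp(-0.5 * k * k) / math.sqrt(2.0 * math.pi)
    Phi = 0.5 * (1.0 + math.erf(k / math.sqrt(2.0)))
    corr = math.sqrt(1.0 - 2.0 * k * phi / (2.0 * Phi - 1.0))  # truncation bias factor
    sigma = 1.4826 * np.median(np.abs(x))        # robust initialisation (MAD)
    for _ in range(n_iter):
        inliers = x[np.abs(x) < k * sigma]
        sigma = inliers.std() / corr
    return sigma

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    silence = rng.normal(0.0, 2.0, size=5000)        # centred normal inliers, sigma = 2
    activity = rng.uniform(8.0, 30.0, size=1500)     # e.g. speech energy over silence
    print(round(inlier_sigma(np.concatenate([silence, activity])), 2))   # close to 2.0
```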
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Modeling high level structure in sports with motion driven HMMs.\n \n \n \n \n\n\n \n Rea, N.; Dahyot, R.; and Kokaram, A.\n\n\n \n\n\n\n In 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, volume 3, pages iii-621-4 vol.3, May 2004. \n URI: http://hdl.handle.net/2262/24562\n\n\n\n
\n\n\n\n \n \n \"ModelingPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@INPROCEEDINGS{Rea_ICASSP04, \nauthor =  {N. Rea and R. Dahyot and A. Kokaram}, \nbooktitle =  {2004 IEEE International Conference on Acoustics, Speech, and Signal Processing}, \ntitle =  {Modeling high level structure in sports with motion driven HMMs}, \nyear =  {2004}, \nvolume =  {3},\nnumber =  {}, \npages =  {iii-621-4 vol.3}, \nkeywords =  {feature extraction;hidden Markov models;image recognition;image retrieval;motion estimation;sport;video signal processing;broadcast sports footage;collision detection;colour based particle filter;dynamic events retrieval;feature extraction;game semantics;hidden Markov model;motion driven HMM;motion extraction;object position evolution modeling;semantic event recognition;snooker ball tracking;snooker table white ball position;sports high level structure modeling;view recognition;view type classification;Broadcasting;Cameras;Educational institutions;Games;Geometry;Hidden Markov models;Interleaved codes;Particle filters;Particle tracking;Video sequences}, \ndoi =  {10.1109/ICASSP.2004.1326621}, \nnote = {URI: http://hdl.handle.net/2262/24562},\nurl = {http://www.tara.tcd.ie/bitstream/handle/2262/24562/01326621.pdf},\nabstract = {In this paper, we investigate the retrieval of dynamic events that occur in broadcast sports footage. Dynamic events in sports are important in so far as they are related to the game semantics. Thus far, the temporal interleaving of camera views has been used to infer these types of events. We propose the use of the spatio-temporal behaviour of an object in the footage as an embodiment of a semantic event. This is accomplished by modeling the evolution of the position of the object with a hidden Markov model (HMM). Snooker is used as an example for the purpose of this research. The system firstly parses the video sequence based on the geometry of the content in the camera view and classifies the footage as a particular view type. Secondly, we consider the relative position of the white ball on the snooker table over the duration of a clip to embody semantic events. A colour based particle filter is employed to robustly track the snooker balls. The temporal behaviour of the white ball is modeled using a HMM where each model is representative of a particular semantic episode. Upon collision of the white ball with another coloured ball, a separate track is instantiated.},\nISSN =  {1520-6149}, \nmonth =  {May}}\n\n
\n
\n\n\n
\n In this paper, we investigate the retrieval of dynamic events that occur in broadcast sports footage. Dynamic events in sports are important in so far as they are related to the game semantics. Thus far, the temporal interleaving of camera views has been used to infer these types of events. We propose the use of the spatio-temporal behaviour of an object in the footage as an embodiment of a semantic event. This is accomplished by modeling the evolution of the position of the object with a hidden Markov model (HMM). Snooker is used as an example for the purpose of this research. The system firstly parses the video sequence based on the geometry of the content in the camera view and classifies the footage as a particular view type. Secondly, we consider the relative position of the white ball on the snooker table over the duration of a clip to embody semantic events. A colour based particle filter is employed to robustly track the snooker balls. The temporal behaviour of the white ball is modeled using a HMM where each model is representative of a particular semantic episode. Upon collision of the white ball with another coloured ball, a separate track is instantiated.\n
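The sketch below illustrates the modelling idea in this abstract with a toy forward-algorithm scorer: a track of quantised white-ball positions is evaluated under candidate HMMs and labelled with the most likely one. The two models, their probabilities and the three-region quantisation are invented for illustration and are not taken from the paper.

```python
import numpy as np

def hmm_log_likelihood(obs, start, trans, emit):
    """Forward algorithm with scaling: log p(obs | HMM). obs is a sequence of
    discrete symbols (e.g. quantised snooker-table regions visited by the white
    ball); start (S,), trans (S,S), emit (S,V) are the model's probabilities."""
    alpha = start * emit[:, obs[0]]
    c = alpha.sum()
    log_lik = np.log(c)
    alpha = alpha / c
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]   # predict then weight by emission
        c = alpha.sum()
        log_lik += np.log(c)                   # accumulate the scaling factors
        alpha = alpha / c
    return log_lik

if __name__ == "__main__":
    # Two illustrative 2-state models over 3 table regions (baulk, middle, top);
    # the event label assigned to a track is the model with the higher likelihood.
    start = np.array([0.5, 0.5])
    models = {
        "break_building": (np.array([[0.9, 0.1], [0.2, 0.8]]),
                           np.array([[0.1, 0.2, 0.7], [0.2, 0.6, 0.2]])),
        "safety_play":    (np.array([[0.6, 0.4], [0.5, 0.5]]),
                           np.array([[0.7, 0.2, 0.1], [0.5, 0.4, 0.1]])),
    }
    track = [2, 2, 1, 2, 2, 2, 1, 2]   # quantised white-ball positions over a shot
    scores = {name: hmm_log_likelihood(track, start, T, E) for name, (T, E) in models.items()}
    print(max(scores, key=scores.get), scores)
```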
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Oriented Particle Spray: A New Probabilistic Contour Tracing with Directional Information.\n \n \n \n \n\n\n \n Pitie, F.; Kokaram, A.; and Dahyot, R.\n\n\n \n\n\n\n In Irish Machine Vision and Image Processing conference (IMVIP 2004), pages 158-165, Dublin, Ireland, September 2004. \n \n\n\n\n
\n\n\n\n \n \n \"OrientedPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@INPROCEEDINGS{PITIE_IMVIP04, \nauthor =  {F. Pitie and A. Kokaram and R. Dahyot }, \ntitle =  {Oriented Particle Spray: A New Probabilistic Contour Tracing with Directional Information}, \nbooktitle =  {Irish Machine Vision and Image Processing conference (IMVIP 2004)}, \nmonth =  {September}, \nyear =  {2004}, \npages =  {158-165},\nurl =  {http://iprcs.org/pdf/IMVIP2004_Proceedings.pdf}, \naddress =  {Dublin, Ireland}}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Semantic Event Detection in Sports Through Motion Understanding.\n \n \n \n \n\n\n \n Rea, N.; Dahyot, R.; and Kokaram, A.\n\n\n \n\n\n\n In Enser, P.; Kompatsiaris, Y.; O'Connor, N. E.; Smeaton, A. F.; and Smeulders, A. W. M., editor(s), Image and Video Retrieval, pages 88–97, Berlin, Heidelberg, 2004. Springer Berlin Heidelberg\n \n\n\n\n
\n\n\n\n \n \n \"SemanticPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@InProceedings{ReaCIVR04, \nauthor =  {Rea, N.\nand Dahyot, R.\nand Kokaram, A.}, \neditor =  {Enser, Peter\nand Kompatsiaris, Yiannis\nand O'Connor, Noel E.\nand Smeaton, Alan F.\nand Smeulders, Arnold W. M.}, \ntitle =  {Semantic Event Detection in Sports Through Motion Understanding},\nbooktitle =  {Image and Video Retrieval},\nyear =  {2004}, \npublisher =  {Springer Berlin Heidelberg}, \naddress =  {Berlin, Heidelberg},\npages =  {88--97}, \nabstract =  {In this paper we investigate the retrieval of semantic events that occur in broadcast sports footage. We do so by considering the spatio-temporal behaviour of an object in the footage as being the embodiment of a particular semantic event. Broadcast snooker footage is used as an example of the sports footage for the purpose of this research. The system parses the sports video using the geometry of the content in view and classifies the footage as a particular view type. A colour based particle filter is then employed to robustly track the snooker balls, in the appropriate view, to evoke the semantics of the event. Over the duration of a player shot, the position of the white ball on the snooker table is used to model the high level semantic structure occurring in the footage. Upon collision of the white ball with another coloured ball, a separate track is instantiated allowing for the detection of pots and fouls, providing additional clues to the event in progress.}, \nisbn =  {978-3-540-27814-6},\nurl={https://link.springer.com/content/pdf/10.1007/978-3-540-27814-6_14.pdf},\ndoi =  {10.1007/978-3-540-27814-6{\\_}14}}\n\n
\n
\n\n\n
\n In this paper we investigate the retrieval of semantic events that occur in broadcast sports footage. We do so by considering the spatio-temporal behaviour of an object in the footage as being the embodiment of a particular semantic event. Broadcast snooker footage is used as an example of the sports footage for the purpose of this research. The system parses the sports video using the geometry of the content in view and classifies the footage as a particular view type. A colour based particle filter is then employed to robustly track the snooker balls, in the appropriate view, to evoke the semantics of the event. Over the duration of a player shot, the position of the white ball on the snooker table is used to model the high level semantic structure occurring in the footage. Upon collision of the white ball with another coloured ball, a separate track is instantiated allowing for the detection of pots and fouls, providing additional clues to the event in progress.\n
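To illustrate the colour-based particle filter mentioned in the abstract, here is a self-contained predict/weight/resample step on a synthetic image, using a Bhattacharyya colour-histogram likelihood. The patch size, noise levels, weighting constant and synthetic scene are assumptions for the sketch, not the paper's settings.

```python
import numpy as np

def colour_hist(patch, bins=8):
    """Normalised histogram of a single-channel (e.g. hue) patch."""
    h, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    return h / max(h.sum(), 1)

def particle_filter_step(particles, image, ref_hist, motion_std=3.0, half=4, rng=None):
    """One predict / weight / resample cycle of a colour-based particle filter.
    particles: (N, 2) array of (x, y) hypotheses for the tracked ball centre."""
    rng = rng if rng is not None else np.random.default_rng()
    n = len(particles)
    hgt, wid = image.shape
    # Predict: random-walk dynamics.
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)
    particles[:, 0] = np.clip(particles[:, 0], half, wid - half - 1)
    particles[:, 1] = np.clip(particles[:, 1], half, hgt - half - 1)
    # Weight: Bhattacharyya similarity between each particle's patch histogram
    # and the reference colour model of the ball.
    weights = np.empty(n)
    for i, (x, y) in enumerate(particles.astype(int)):
        patch = image[y - half:y + half, x - half:x + half]
        bc = np.sum(np.sqrt(colour_hist(patch) * ref_hist))
        weights[i] = np.exp(-20.0 * (1.0 - bc))
    weights /= weights.sum()
    # Resample proportionally to the weights.
    return particles[rng.choice(n, size=n, p=weights)]

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    img = rng.uniform(0.0, 0.3, size=(80, 80))   # dark "table"
    img[40:48, 50:58] = 0.9                      # bright "ball"
    ref = colour_hist(img[40:48, 50:58])
    parts = rng.uniform(10.0, 70.0, size=(100, 2))
    for _ in range(15):
        parts = particle_filter_step(parts, img, ref, rng=rng)
    print(parts.mean(axis=0).round(1))           # drifts towards the ball, near (54, 44)
```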
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2003\n \n \n (6)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Analyse d'images séquentielles de scènes routières par modèles d'apparence pour la gestion du réseau routier.\n \n \n \n \n\n\n \n Dahyot, R.\n\n\n \n\n\n\n of Etudes et Recherches des Laboratoires des Ponts et Chaussées, Paris : Laboratoire Central des Ponts et Chaussées (LCPC) 2-7208-2028-1, France, September 2003.\n (published in French)\n\n\n\n
\n\n\n\n \n \n \"AnalysePaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@BOOK{B-Dahyot03, \nauthor =  {Rozenn Dahyot}, \ntitle =  {Analyse d'images s\\'{e}quentielles de sc\\`{e}nes routi\\`{e}res par mod\\`{e}les d'apparence pour la gestion du r\\'{e}seau routier},\npublisher =  {Paris : Laboratoire Central des Ponts et Chaussées (LCPC) 2-7208-2028-1}, \nseries =  {Etudes et Recherches des Laboratoires des Ponts et Chaussées}, \naddress =  {France}, \nyear =  {2003}, \nmonth =  {September}, \nurl={https://roznn.github.io/PDF/mem_dahyot.pdf},\nnote =  {(published in french)}}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Détection robuste par modèle probabiliste d'apparence : une approche bayésienne.\n \n \n \n \n\n\n \n Dahyot, R.; Charbonnier, P.; and Heitz, F.\n\n\n \n\n\n\n Traitement du Signal, 20(2): 101-117. 2003.\n HANDLE: http://hdl.handle.net/2042/2221\n\n\n\n
\n\n\n\n \n \n \"DetectionPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 1 download\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@article{TS2003,\ntitle =  {D\\'{e}tection robuste par mod\\`{e}le probabiliste d'apparence : une approche bay\\'{e}sienne}, \nauthor =  {R. Dahyot and P. Charbonnier and F. Heitz}, \njournal =  {Traitement du Signal}, \nvolume =  {20},\nabstract = {In this paper, methods are proposed to detect objects in complex scenes using statistical global appearance-based models. In our approach, the standard eigenspace representation of a training image database and a priori non-Gaussian hypotheses are brought together in a Bayesian framework. This work unifies standard (appearance-based) detection methods already proposed in the literature and leads naturally to the definition of a new family of probabilistic detectors. It allows the use of more general a priori assumptions about the distribution on the eigenspace and its orthogonal. Experimental results are illustrated with ROC (Receiver Operating Characteristic) curves and show the major improvement of our Bayesian approach in comparison to the standard methods that have been the references up to now [2, 14].},\nnumber =  {2}, \npages =  {101-117}, \nyear =  {2003}, \nnote = {HANDLE: http://hdl.handle.net/2042/2221},\nurl =  {http://documents.irevues.inist.fr/bitstream/handle/2042/2221/Charbonnier.pdf}}\n\n
\n
\n\n\n
\n In this paper, methods are proposed to detect objects in complex scenes using statistical global appearance-based models. In our approach, the standard eigenspace representation of a training image database and a priori non-Gaussian hypotheses are brought together in a Bayesian framework. This work unifies standard (appearance-based) detection methods already proposed in the literature and leads naturally to the definition of a new family of probabilistic detectors. It allows the use of more general a priori assumptions about the distribution on the eigenspace and its orthogonal. Experimental results are illustrated with ROC (Receiver Operating Characteristic) curves and show the major improvement of our Bayesian approach in comparison to the standard methods that have been the references up to now [2, 14].\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Joint audio visual retrieval for tennis broadcasts.\n \n \n \n \n\n\n \n Dahyot, R.; Kokaram, A.; Rea, N.; and Denman, H.\n\n\n \n\n\n\n In Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on, volume 3, pages III-561-4 vol.3, April 2003. \n URI: http://hdl.handle.net/2262/81765\n\n\n\n
\n\n\n\n \n \n \"JointPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@INPROCEEDINGS{DahyotICASSP03, \nauthor =  {R. Dahyot and A. Kokaram and N. Rea and H. Denman},\nbooktitle =  {Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on}, \ntitle =  {Joint audio visual retrieval for tennis broadcasts}, \nyear =  {2003}, \nvolume =  {3}, \nnumber =  {}, \npages =  {III-561-4 vol.3}, \nkeywords =  {audio coding;content-based retrieval;feature extraction;image retrieval;maximum likelihood estimation;principal component analysis;sport;stochastic processes;video coding;PCA;audio features;content retrieval;image features;image moments;joint audio visual retrieval;key episode identification;likelihood approach;scene geometry;sports;stochastic processes;tennis broadcasts;Broadcasting;Content based retrieval;Geometry;Layout;Multimedia communication;Principal component analysis;Robustness;Solid modeling;Stochastic processes;Streaming media},\ndoi =  {10.1109/ICASSP.2003.1199536}, \nabstract = {In recent years, there has been increasing work in the area of content retrieval for sports. The idea is generally to extract important events or create summaries to allow personalisation of the media stream. While previous work in sports analysis has employed either the audio or video stream to achieve some goal, there is little work that explores how much can be achieved by combining the two streams. This paper combines both audio and image features to identify the key episode in tennis broadcasts. The image feature is based on image moments and is able to capture the essence of scene geometry without recourse to 3D modelling. The audio feature uses PCA to identify the sound of the ball hitting the racket. The features are modelled as stochastic processes and the work combines the features using a likelihood approach. The results show that combining the features yields a much more robust system than using the features separately.},\nurl = {http://www.tara.tcd.ie/bitstream/handle/2262/81765/final_icassp03.pdf},\nnote = {URI: http://hdl.handle.net/2262/81765},\nISSN =  {1520-6149}, month =  {April}}\n\n
\n
\n\n\n
\n In recent years, there has been increasing work in the area of content retrieval for sports. The idea is generally to extract important events or create summaries to allow personalisation of the media stream. While previous work in sports analysis has employed either the audio or video stream to achieve some goal, there is little work that explores how much can be achieved by combining the two streams. This paper combines both audio and image features to identify the key episode in tennis broadcasts. The image feature is based on image moments and is able to capture the essence of scene geometry without recourse to 3D modelling. The audio feature uses PCA to identify the sound of the ball hitting the racket. The features are modelled as stochastic processes and the work combines the features using a likelihood approach. The results show that combining the features yields a much more robust system than using the features separately.\n
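A minimal sketch of the likelihood-based fusion idea in this abstract: each modality's feature is scored under its own Gaussian model and the log-likelihoods are added, assuming conditional independence. The feature dimensions and all model parameters below are hypothetical placeholders, not values or models from the paper.

```python
import numpy as np

def gaussian_loglik(x, mean, var):
    """Log-likelihood of a feature vector under a diagonal Gaussian model."""
    x, mean, var = np.asarray(x, float), np.asarray(mean, float), np.asarray(var, float)
    return float(-0.5 * np.sum((x - mean) ** 2 / var + np.log(2.0 * np.pi * var)))

def joint_event_score(image_feat, audio_feat, image_model, audio_model):
    """Add the per-modality log-likelihoods (features assumed conditionally
    independent given the event) -- a simple stand-in for joint A/V scoring."""
    return (gaussian_loglik(image_feat, *image_model)
            + gaussian_loglik(audio_feat, *audio_model))

if __name__ == "__main__":
    # Hypothetical event models: image moments typical of a full-court view and
    # an audio-PCA coefficient typical of a ball/racket impact (made-up values).
    image_model = ([0.5, 0.1], [0.01, 0.005])      # (mean, variance) per dimension
    audio_model = ([2.0], [0.25])
    during_rally = joint_event_score([0.52, 0.11], [2.1], image_model, audio_model)
    off_play = joint_event_score([0.30, 0.40], [0.2], image_model, audio_model)
    print(during_rally > off_play)                 # the in-play observation scores higher
```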
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Simultaneous Luminance and Position Stabilization for Film and Video.\n \n \n \n \n\n\n \n Kokaram, A. C.; Dahyot, R.; Pitie, F.; and Denman, H.\n\n\n \n\n\n\n In Proc.SPIE Visual Communications and Image Processing, volume 5022, pages 5022 - 5022 - 12, 2003. \n \n\n\n\n
\n\n\n\n \n \n \"SimultaneousPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@INPROCEEDINGS{Kokaram_VCIP03, \ntitle =  {Simultaneous Luminance and Position Stabilization for Film and Video},\nauthor =  {A. C. Kokaram and R. Dahyot and F. Pitie and H. Denman},\nbooktitle =  {Proc.SPIE Visual Communications and Image Processing}, \nvolume =  {5022}, \nnumber =  {}, \npages =  {5022 - 5022 - 12}, \nyear =  {2003},\ndoi =  {10.1117/12.476584}, \nabstract = {Temporal and spatial random variation of luminance in images, or 'flicker' is a typical degradation observed in archived film and video. The underlying premise in typical flicker reduction algorithms is that each image must be corrected for a spatially varying gain and offset. These parameters are estimated in the stationary region of the image. Hence the performance of that algorithm depends crucially on the identification of stationary image regions. Position fluctuations are also a common artefact resulting in a random 'shake' of each film frame. For removing both, the key is to reject regions showing local motion or other outlier activity. Parameters are then estimated mostly on that part of the image undergoing the dominant motion. A new algorithm that simultaneously deals with global motion estimation and flicker is presented. The final process is based on a robust application of weighted least-squares, in which the weights also classify portions of the image as local or global. The paper presents results on severely degraded sequences showing evidence of both Flicker and random shake.},\nurl =  {https://roznn.github.io/PDF/vcip2003_kokaram_pitie.pdf}, \neprint =  {}}\n\n
\n
\n\n\n
\n Temporal and spatial random variation of luminance in images, or 'flicker', is a typical degradation observed in archived film and video. The underlying premise in typical flicker reduction algorithms is that each image must be corrected for a spatially varying gain and offset. These parameters are estimated in the stationary region of the image. Hence the performance of that algorithm depends crucially on the identification of stationary image regions. Position fluctuations are also a common artefact resulting in a random 'shake' of each film frame. For removing both, the key is to reject regions showing local motion or other outlier activity. Parameters are then estimated mostly on that part of the image undergoing the dominant motion. A new algorithm that simultaneously deals with global motion estimation and flicker is presented. The final process is based on a robust application of weighted least-squares, in which the weights also classify portions of the image as local or global. The paper presents results on severely degraded sequences showing evidence of both flicker and random shake.\n
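As a rough illustration of the joint estimation idea, the sketch below solves for a global gain/offset and a small global shift in one robust weighted least-squares system obtained by linearising the model with image gradients. This is a simplified stand-in constructed for illustration, not the paper's algorithm, and it is only meaningful for shifts of a pixel or so.

```python
import numpy as np

def joint_flicker_shift(frame, reference, n_iter=10, c=3.0):
    """Jointly estimate a brightness gain/offset (a, b) and a small global shift
    (dx, dy) mapping `frame` onto `reference`, via robust weighted least squares
    on the linearised model  ref - frame ~ (a-1)*frame + b + a*dx*Ix + a*dy*Iy."""
    ix = np.gradient(frame, axis=1).ravel()
    iy = np.gradient(frame, axis=0).ravel()
    f = frame.ravel()
    d = (reference - frame).ravel()
    A = np.stack([f, np.ones_like(f), ix, iy], axis=1)
    w = np.ones(len(d))
    p = np.zeros(4)
    for _ in range(n_iter):
        sw = np.sqrt(w)[:, None]
        p = np.linalg.lstsq(A * sw, d * sw.ravel(), rcond=None)[0]
        r = d - A @ p
        scale = 1.4826 * np.median(np.abs(r)) + 1e-9
        w = np.where(np.abs(r) < c * scale, 1.0, c * scale / np.abs(r))
    a, b = 1.0 + p[0], p[1]
    return a, b, p[2] / a, p[3] / a          # gain, offset, dx, dy

if __name__ == "__main__":
    yy, xx = np.mgrid[0:64, 0:64].astype(float)
    frame = np.sin(xx / 6.0) * 40.0 + np.cos(yy / 7.0) * 30.0 + 100.0
    reference = 0.8 * np.roll(frame, -1, axis=1) + 10.0   # flicker + 1-pixel shift
    print([round(v, 2) for v in joint_flicker_shift(frame, reference)])  # ~[0.8, 10.0, 1.0, 0.0]
```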
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Sport video shot segmentation and classification.\n \n \n \n \n\n\n \n Dahyot, R.; Rea, N.; and Kokaram, A. C.\n\n\n \n\n\n\n In Proc. SPIE Visual Communications and Image Processing 2003, volume 5150, pages 5150 - 5150 - 10, 2003. \n \n\n\n\n
\n\n\n\n \n \n \"SportPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@INPROCEEDINGS{Dahyot_VCIP03, \nauthor =  {Rozenn  Dahyot and Niall  Rea and Anil C. Kokaram}, \ntitle =  {Sport video shot segmentation and classification}, \nbooktitle =  {Proc. SPIE Visual Communications and Image Processing 2003},\nvolume =  {5150}, \nnumber =  {}, \npages =  {5150 - 5150 - 10}, \nyear =  {2003},\ndoi =  {10.1117/12.503127}, \nabstract = {This paper considers the statistics of local appearance based measures that are suitable for the visual parsing of sport events. The moments of the colour information are computed, and the shape content in the frames is characterised by the moments of local shape measures. Their generation process is very low cost. The temporal evolution of the features then is modelled with a Hidden Markov Model. The HMM is used to generate higher level information by classifying the shots as close ups, court views, crowd shots and so on. The paper illustrates how those simple features, coupled with the HMM, can be used for parsing snooker and tennis footages. },\neprint =  {},\nURL =  {http://www.tara.tcd.ie/bitstream/handle/2262/37046/Sport%20Video%20Shot.pdf} \n}\n\n
\n
\n\n\n
\n This paper considers the statistics of local appearance-based measures that are suitable for the visual parsing of sport events. The moments of the colour information are computed, and the shape content in the frames is characterised by the moments of local shape measures. Their generation process is very low cost. The temporal evolution of the features is then modelled with a Hidden Markov Model. The HMM is used to generate higher level information by classifying the shots as close-ups, court views, crowd shots and so on. The paper illustrates how those simple features, coupled with the HMM, can be used for parsing snooker and tennis footage.\n
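The colour-moment features mentioned in the abstract can be sketched as follows: per-frame mean, standard deviation and skewness of each channel, giving a cheap vector that a shot-type classifier such as an HMM could consume. The toy frames are synthetic and this feature choice is illustrative rather than the paper's exact measure set (which also includes local shape moments).

```python
import numpy as np

def colour_moments(frame):
    """First three moments (mean, std, skewness) of each colour channel of an
    RGB frame -- a cheap per-frame feature vector for shot-type classification."""
    feats = []
    for ch in range(frame.shape[2]):
        v = frame[..., ch].ravel().astype(float)
        mu = v.mean()
        sd = v.std() + 1e-12
        skew = np.mean(((v - mu) / sd) ** 3)
        feats.extend([mu, sd, skew])
    return np.array(feats)

if __name__ == "__main__":
    rng = np.random.default_rng(6)
    court_view = np.zeros((72, 128, 3))
    court_view[..., 1] = 0.6                            # mostly green playing area
    crowd_view = rng.uniform(0.0, 1.0, size=(72, 128, 3))   # mixed colours
    print(colour_moments(court_view).round(2))
    print(colour_moments(crowd_view).round(2))
```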
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Suppression du bruit de pompage dans les videos.\n \n \n \n \n\n\n \n Pitie, F.; Dahyot, R.; and Kokaram, A.\n\n\n \n\n\n\n In proceedings of GRETSI conference on signal and image processing, Paris, France, September 2003. \n \n\n\n\n
\n\n\n\n \n \n \"SuppressionPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@INPROCEEDINGS{PitieGRETSI2003, \ntitle =  {Suppression du bruit de pompage dans les videos}, \nauthor =  {F. Pitie and R. Dahyot and A. Kokaram},\nbooktitle =  {proceedings of GRETSI conference on signal and image processing}, \nmonth =  {September}, \nyear =  {2003},\nabstract = {La variation temporelle de la luminance dans les sequences d'images, ou effet de pompage, est une dégradation typique des archives videos et cinematographiques. Nous proposons ici un nouveau procede qui vise à supprimer ces perturbations visuellement désagréables. Plusieurs améliorations sont proposées à la fois sur le modèle de pompage, l'estimation des paramètres correspondants et sur la méthode de compensation des images. Les expériences menées sur des videos, dont l'une est particulièrement dégradée, permettent de montrer l'apport de notre système de restauration par rapport aux méthodes existantes.},\naddress =  {Paris, France},\ndoi =  {2042/13630}, \nurl =  {http://documents.irevues.inist.fr/bitstream/handle/2042/13630/A275.pdf}}\n\n
\n
\n\n\n
\n Temporal variation of luminance in image sequences, or flicker, is a typical degradation of film and video archives. We propose a new procedure that aims to remove these visually unpleasant disturbances. Several improvements are proposed to the flicker model, to the estimation of the corresponding parameters and to the image compensation method. Experiments carried out on videos, one of which is particularly degraded, show the benefit of our restoration system compared with existing methods.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2002\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n Comparison of Global motion estimators .\n \n \n \n\n\n \n Delacourt, P.; Kokaram, A.; and Dahyot, R.\n\n\n \n\n\n\n In proceedings of Irish Signals and Systems Conference, Cork, Ireland, June 2002. \n \n\n\n\n
\n\n\n\n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@INPROCEEDINGS{Dahyot_ISS02, \nauthor =  {P. Delacourt and A. Kokaram and R. Dahyot}, \ntitle =  {Comparison of Global motion estimators }, \nbooktitle =  {proceedings of Irish Signals and Systems Conference}, \nmonth =  {June},\nyear =  {2002}, \naddress =  {Cork, Ireland}}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2001\n \n \n (4)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Analyse d'images séquentielles de scènes routières par modèles d'apparence pour la gestion du réseau routier (Appearance based road scene video analysis for the management of the road network).\n \n \n \n \n\n\n \n Dahyot, R.\n\n\n \n\n\n\n Ph.D. Thesis, University of Strasbourg I, France, November 2001.\n (published in French)\n\n\n\n
\n\n\n\n \n \n \"AnalysePaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n  \n \n 1 download\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@PHDTHESIS{Dahyot01, \nauthor =  {Rozenn Dahyot}, \ntitle =  {Analyse d'images s\\'{e}quentielles de sc\\`{e}nes routi\\`{e}res par mod\\`{e}les d'apparence pour la gestion du r\\'{e}seau routier (Appearance based road scene video analysis for the management of the road network)}, \nschool =  {University of Strasbourg I}, \naddress =  {France}, \nyear =  {2001}, \nmonth =  {November}, \nnote =  {(published in French)}, \nurl={https://roznn.github.io/PDF/mem_dahyot.pdf},\n%url = {https://publication-theses.unistra.fr/public/theses_doctorat/2001/DAHYOT_Rozenn_2001.pdf},\neprint =  {http://theses.fr/2001STR13130}}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Détection d'événements dans les séquences d'images avec caméra en mouvement.\n \n \n \n \n\n\n \n Dahyot, R.; Charbonnier, P.; and Heitz, F.\n\n\n \n\n\n\n In proceedings of GRETSI conference on signal and image processing, Toulouse, France, September 2001. \n \n\n\n\n
\n\n\n\n \n \n \"DétectionPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@INPROCEEDINGS{Dahyota_gretsi01_event, \nauthor =  {R. Dahyot and P. Charbonnier and F. Heitz},\ntitle =  {D\\'{e}tection d'\\'{e}v\\'{e}nements dans les s\\'{e}quences d'images avec cam\\'{e}ra  en mouvement},\nbooktitle =  {proceedings of GRETSI conference on signal and image processing}, \nmonth =  {September},\nabstract = {La détection de changements dans les séquences d'images s'est principalement intéressée à la détection d'objets mobiles quand le système d'acquisition est statique, ou à la détection d'effets de production, comme les changements de plans. Lorsque la caméra est mobile, son mouvement est classiquement géré par compensation du mouvement dominant, ce qui met en oeuvre des techniques d'estimation de mouvement et/ou de segmentation. Dans cet article, nous proposons une nouvelle méthode de détection de changements statistiques capable de gérer des événements complexes tels que l'entrée ou la sortie d'objets, et le changement d'apparence d'objets quand la caméra est en mouvement. Les changements temporels sont extraits en analysant les distributions statistiques d'images successives. Si l'on considère des mesures appropriées, nous montrons comment extraire les statistiques des objets changeants en utilisant deux histogrammes d'images successives. Ces objets sont ensuite localisés par une technique de rétroprojection. La méthode est complètement non supervisée et ne nécessite ni estimation, ni compensation du mouvement. Elle est illustrée sur des images de scènes routières présentant de grands mouvements de caméra.},\nurl = {http://documents.irevues.inist.fr/bitstream/handle/2042/13333/PAPER188.pdf},\nyear =  {2001}, \naddress =  {Toulouse, France},\ndoi =  {2042/13333}}\n\n
\n
\n\n\n
\n Change detection in image sequences has mainly focused on detecting moving objects when the acquisition system is static, or on detecting production effects such as shot changes. When the camera is moving, its motion is classically handled by compensating the dominant motion, which involves motion estimation and/or segmentation techniques. In this article, we propose a new statistical change detection method able to handle complex events such as objects entering or leaving the scene, and changes in the appearance of objects, while the camera is moving. Temporal changes are extracted by analysing the statistical distributions of successive images. Given appropriate measurements, we show how to extract the statistics of the changing objects using two histograms of successive images. These objects are then localised by a backprojection technique. The method is fully unsupervised and requires neither motion estimation nor motion compensation. It is illustrated on images of road scenes with large camera motions.\n
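A minimal numpy sketch of the histogram-differencing and backprojection idea described in the abstract above: bins whose mass grows between two frames are taken to describe the appearing object, and that difference is backprojected onto the current frame. The bin count, threshold and synthetic road scene are illustrative assumptions, not the paper's measures or parameters.

```python
import numpy as np

def backproject_changes(prev, curr, bins=32):
    """Histogram the two frames, keep the bins whose mass increases (intensities
    brought in by new/changed objects), and backproject that difference onto the
    current frame to get a per-pixel change map -- no motion estimation needed."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    h_prev, _ = np.histogram(prev, bins=edges)
    h_curr, _ = np.histogram(curr, bins=edges)
    h_prev = h_prev / prev.size
    h_curr = h_curr / curr.size
    appear = np.clip(h_curr - h_prev, 0.0, None)       # histogram mass that appeared
    ratio = appear / (h_curr + 1e-9)                    # fraction of each bin that is "new"
    idx = np.clip(np.digitize(curr, edges) - 1, 0, bins - 1)
    return ratio[idx]                                   # backprojection onto the image

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    prev = rng.uniform(0.3, 0.5, size=(60, 80))         # road-scene "background"
    curr = prev.copy()
    curr[20:35, 40:60] = 0.9                            # an object enters the view
    change = backproject_changes(prev, curr)
    ys, xs = np.nonzero(change > 0.5)
    print(ys.min(), ys.max(), xs.min(), xs.max())       # roughly 20 34 40 59
```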
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Détection robuste d'objets : une approche par modele d'apparence.\n \n \n \n \n\n\n \n Dahyot, R.; Charbonnier, P.; and Heitz, F.\n\n\n \n\n\n\n In proceedings of GRETSI conference on signal and image processing, Toulouse, France, September 2001. \n \n\n\n\n
\n\n\n\n \n \n \"DétectionPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@INPROCEEDINGS{Dahyot_gretsi01_robust, \nauthor =  {R. Dahyot and P. Charbonnier and F. Heitz}, \ntitle =  {D\\'{e}tection robuste d'objets : une approche par modele d'apparence}, \nbooktitle =  {proceedings of GRETSI conference on signal and image processing}, \nabstract = {Les méthodes classiques de détection basées sur la représentation de l'apparence par espace propre sont sensibles à la présence d'erreurs grossières dans les observations, induites, par exemple, par des occultations. Récemment, l'utilisation de techniques issues des statistiques robustes, les M-estimateurs, ont permis de gérer la présence de ces données erronées dans le cadre de la reconnaissance d'objets. Nous proposons dans cet article d'étendre cette approche robuste pour définir deux nouveaux détecteurs, capables de localiser les occurrences dégradées ou occultées d'objets d'intérêt dans des scènes texturées.},\nmonth =  {September}, \nyear =  {2001}, \naddress =  {Toulouse, France},\nurl = {http://documents.irevues.inist.fr/bitstream/handle/2042/13335/PAPER191.pdf},\ndoi =  {2042/13335}}\n\n
\n
\n\n\n
\n Classical detection methods based on eigenspace representations of appearance are sensitive to gross errors in the observations, induced, for example, by occlusions. Recently, techniques from robust statistics, namely M-estimators, have made it possible to handle such erroneous data in the context of object recognition. In this article, we propose to extend this robust approach to define two new detectors, capable of localising degraded or occluded instances of objects of interest in textured scenes.\n
\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Unsupervised statistical detection of changing objects in camera-in-motion video.\n \n \n \n \n\n\n \n Dahyot, R.; Charbonnier, P.; and Heitz, F.\n\n\n \n\n\n\n In Proceedings 2001 International Conference on Image Processing, volume 1, pages 638-641, 2001. \n Github: https://github.com/Roznn/Detection-of-Changing-Objects-in-Camera-in-Motion-Video\n\n\n\n
\n\n\n\n \n \n \"UnsupervisedPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@INPROCEEDINGS{Dahyot_icip01,\nauthor =  {R. Dahyot and P. Charbonnier and F. Heitz},\nbooktitle =  {Proceedings 2001 International Conference on Image Processing}, \ntitle =  {Unsupervised statistical detection of changing objects in camera-in-motion video},\nyear =  {2001}, \nvolume =  {1}, \nnumber =  {}, \npages =  {638-641}, \nkeywords =  {feature extraction;image sequences;statistical analysis;backprojection;camera motion;camera-in-motion video;change detection;entering objects;exiting objects;image features;image histograms;image sequences;moving objects;object appearance;road scenes;unsupervised statistical detection;Cameras;Event detection;Gunshot detection systems;Image analysis;Image segmentation;Image sequences;Layout;Motion estimation;Object detection;Production systems}, \ndoi =  {10.1109/ICIP.2001.959126}, \nurl = {https://github.com/Roznn/Detection-of-Changing-Objects-in-Camera-in-Motion-Video/blob/master/paper/htm_icip2001.pdf},\nnote = {Github: https://github.com/Roznn/Detection-of-Changing-Objects-in-Camera-in-Motion-Video},\nISSN =  {}, \nmonth =  {}}\n\n
\n
\n\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 2000\n \n \n (1)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Robust visual recognition of colour images.\n \n \n \n \n\n\n \n Dahyot, R.; Charbonnier, P.; and Heitz, F.\n\n\n \n\n\n\n In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR), volume 1, pages 685-690 vol.1, 2000. \n \n\n\n\n
\n\n\n\n \n \n \"RobustPaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 1 download\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n \n \n\n\n\n
\n
@INPROCEEDINGS{Dahyot_cvpr00, \nauthor =  {R. Dahyot and P. Charbonnier and F. Heitz}, \nbooktitle =  {Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},\ntitle =  {Robust visual recognition of colour images}, \nyear =  {2000}, \nvolume =  {1}, \nnumber =  {}, \npages =  {685-690 vol.1},\nkeywords =  {estimation theory;image recognition;image representation;appearance-based representation;colour images;pattern recognition;robust estimation;visual recognition;weighted least squares;Databases;Electrical capacitance tomography;Equations;Image recognition;Image reconstruction;Image segmentation;Least squares methods;Parameter estimation;Pattern recognition;Robustness}, \ndoi =  {10.1109/CVPR.2000.855886}, \nurl = {https://roznn.github.io/PDF/htm_Cvpr00.pdf},\nabstract = {In this paper a robust pattern recognition system, using an appearance-based representation of colour images is described. Standard appearance-based approaches are not robust to outliers, occlusions or segmentation errors. The approach proposed here relies on robust M-estimators, involving non-quadratic and possibly non-convex energy functions. To deal with the minimisation of non-convex functions in a deterministic framework, we introduce an estimation scheme relying on M-estimators used in continuation, from convex functions to hard redescending nonconvex estimators. At each step of the robust estimation scheme, the non-quadratic criterion is minimized using the half-quadratic theory. This leads to a weighted least squares algorithm, which is easy to implement. The proposed robust estimation scheme does not require any user interaction because all necessary parameters are previously estimated. The method is illustrated on a road sign recognition application. Experiments show significant improvements with respect to standard estimation schemes.},\nISSN =  {1063-6919}, \nmonth =  {}}\n\n
\n
\n\n\n
\n In this paper a robust pattern recognition system, using an appearance-based representation of colour images is described. Standard appearance-based approaches are not robust to outliers, occlusions or segmentation errors. The approach proposed here relies on robust M-estimators, involving non-quadratic and possibly non-convex energy functions. To deal with the minimisation of non-convex functions in a deterministic framework, we introduce an estimation scheme relying on M-estimators used in continuation, from convex functions to hard redescending nonconvex estimators. At each step of the robust estimation scheme, the non-quadratic criterion is minimized using the half-quadratic theory. This leads to a weighted least squares algorithm, which is easy to implement. The proposed robust estimation scheme does not require any user interaction because all necessary parameters are previously estimated. The method is illustrated on a road sign recognition application. Experiments show significant improvements with respect to standard estimation schemes.\n
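To illustrate the half-quadratic and continuation ideas in this abstract, the sketch below projects an occluded image onto an eigenspace by alternating weighted least squares for the coefficients with a reweighting of pixel residuals, while shrinking the scale parameter as a simple continuation towards a redescending criterion. The Cauchy-type weight function, the scale schedule and the toy data are illustrative choices, not the exact sequence of estimators used in the paper.

```python
import numpy as np

def robust_project(x, mean, basis, sigmas, n_iter=5):
    """Half-quadratic style projection of an (occluded) image onto an eigenspace:
    weighted least squares for the coefficients alternated with residual
    reweighting, with a decreasing scale as a crude convex-to-non-convex continuation."""
    c = basis @ (x - mean)                      # plain least-squares start
    w = np.ones_like(x)
    for sigma in sigmas:                        # continuation: decreasing scale
        for _ in range(n_iter):
            sw = np.sqrt(w)[:, None]
            # weighted LS: minimise sum_i w_i (x_i - mean_i - (B^T c)_i)^2
            c = np.linalg.lstsq(basis.T * sw, (x - mean) * sw.ravel(), rcond=None)[0]
            r = x - mean - basis.T @ c
            w = 1.0 / (1.0 + (r / sigma) ** 2)  # Cauchy-type half-quadratic weights
    return c, w

if __name__ == "__main__":
    rng = np.random.default_rng(8)
    basis = np.linalg.qr(rng.normal(size=(256, 4)))[0].T   # orthonormal (4, 256) eigenspace
    mean = np.zeros(256)
    c_true = np.array([3.0, -2.0, 1.0, 0.5])
    x = basis.T @ c_true
    x[:64] = 5.0                                            # simulated occlusion
    c_ls = basis @ (x - mean)                               # non-robust projection
    c_rb, _ = robust_project(x, mean, basis, sigmas=[2.0, 1.0, 0.5, 0.25])
    print(np.abs(c_ls - c_true).max().round(2), np.abs(c_rb - c_true).max().round(2))
```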
\n\n\n
\n\n\n\n\n\n
\n
\n\n
\n
\n  \n 1999\n \n \n (2)\n \n \n
\n
\n \n \n
\n \n\n \n \n \n \n \n \n Non-Supervised Robust Visual Recognition of Colour Images using Half-Quadratic Theory.\n \n \n \n \n\n\n \n Dahyot, R.; Charbonnier, P.; and Heitz, F.\n\n\n \n\n\n\n In proceedings of European Workshop on Content-Based Multimedia Indexing (CBMI), Toulouse, France, October 1999. \n \n\n\n\n
\n\n\n\n \n \n \"Non-SupervisedPaper\n  \n \n\n \n\n \n link\n  \n \n\n bibtex\n \n\n \n\n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@INPROCEEDINGS{Dahyot_cbmi99, \nauthor =  {R. Dahyot and P. Charbonnier and F. Heitz}, \ntitle =  {Non-Supervised Robust Visual Recognition of Colour Images  using Half-Quadratic Theory}, \nbooktitle =  {proceedings of European Workshop on Content-Based Multimedia Indexing (CBMI)},\nurl = {https://roznn.github.io/PDF/htm_cbmi99.pdf},\nmonth =  {October}, \nyear =  {1999}, \naddress =  {Toulouse, France}}\n\n
\n
\n\n\n\n
\n\n\n
\n \n\n \n \n \n \n \n \n Reconnaissance robuste non supervisée d'images en couleur utilisant la théorie semi-quadratique.\n \n \n \n \n\n\n \n Dahyot, R.; Charbonnier, P.; and Heitz, F.\n\n\n \n\n\n\n In proceedings of GRETSI conference on signal and image processing, volume 2, pages 295-298, Vannes, France, September 1999. \n \n\n\n\n
\n\n\n\n \n \n \"ReconnaissancePaper\n  \n \n\n \n \n doi\n  \n \n\n \n link\n  \n \n\n bibtex\n \n\n \n  \n \n abstract \n \n\n \n  \n \n 1 download\n \n \n\n \n \n \n \n \n \n \n\n  \n \n \n\n\n\n
\n
@INPROCEEDINGS{Dahyot_gretsi99,\nauthor =  {R. Dahyot and P. Charbonnier and F. Heitz},\ntitle =  {Reconnaissance robuste non supervis\\'{e}e d'images en couleur utilisant la th\\'{e}orie semi-quadratique},\nabstract = {Cet article décrit un système robuste de reconnaissance d'objets à partir d'images en couleur. \nLes méthodes usuelles basées sur l'apparence sont sensibles aux données erronées occasionnées par des occlusions \nou des erreurs de segmentation. L'approche proposée ici utilise les M-estimateurs mettant en oeuvre des fonctions d'énergies \nnon-quadratiques voire non-convexes. Pour minimiser ces fonctions non-convexes, nous présentons un système d'estimation \nutilisant les M-estimateurs en continuation, d'une fonction convexe vers des estimateurs non-convexes. \nÀ chaque étape de cette chaîne robuste, un critère non-quadratique est minimisé grâce à la théorie semi-quadratique. \nCeci conduit à un algorithme de moindres carrés pondérés facile à implémenter, peu coûteux et non supervisé (tous les paramètres \nétant estimés automatiquement). Cette méthode est illustrée ici dans un problème de reconnaissance de panneaux routiers.},\nbooktitle =  {proceedings of GRETSI conference on signal and image processing}, \nvolume =  {2}, \npages =  {295-298},\nmonth =  {September}, \nyear =  {1999}, \nurl = {http://documents.irevues.inist.fr/bitstream/handle/2042/12964/ARTI1293.pdf},\naddress =  {Vannes, France}, \ndoi =  {2042/12964}}\n\n
\n
\n\n\n
\n This article describes a robust object recognition system for colour images. The usual appearance-based methods are sensitive to erroneous data caused by occlusions or segmentation errors. The approach proposed here uses M-estimators involving non-quadratic and possibly non-convex energy functions. To minimise these non-convex functions, we present an estimation scheme using M-estimators in continuation, from a convex function towards non-convex estimators. At each step of this robust chain, a non-quadratic criterion is minimised using half-quadratic theory. This leads to a weighted least-squares algorithm that is easy to implement, computationally cheap and unsupervised (all parameters being estimated automatically). The method is illustrated on a road-sign recognition problem.\n
\n\n\n
\n\n\n\n\n\n
\n
\n\n\n\n\n
\n\n\n \n\n \n \n \n \n\n
\n"}; document.write(bibbase_data.data);