
DeepRare: Generic Unsupervised Visual Attention Models

The human visual system is modeled in engineering through feature-engineered methods that detect contrasted, surprising, or unusual data in images. Such data is "interesting" to humans and leads to numerous applications. Deep neural networks (DNNs) have drastically improved the algorithms' efficiency on the main benchmark datasets. However, DNN-based models are counter-intuitive: surprising or unusual data is by definition difficult to learn because of its low occurrence probability.

Deep soccer captioning with transformer: dataset, semantics-related losses, and multi-level evaluation

This work aims at generating captions for soccer videos using deep learning. In this context, this paper introduces a dataset, a model, and a triple-level evaluation. The dataset consists of 22k caption-clip pairs and three visual features (images, optical flow, inpainting) for ~500 hours of SoccerNet videos. The model is divided into three parts: a transformer learns language, ConvNets learn vision, and a fusion of linguistic and visual features generates captions.

Analysis of Co-Laughter Gesture Relationship on RGB videos in Dyadic Conversation Context

The development of virtual agents has enabled human-avatar interactions to become increasingly rich and varied. Moreover, an expressive virtual agent, i.e. one that mimics the natural expression of emotions, enhances social interaction between a user (human) and an agent (intelligent machine). The set of non-verbal behaviors of a virtual character is, therefore, an important component in the context of human-machine interaction.

How does explicit orientation encoding affect image classification of ConvNets?

Some shapes look different to us if rotated. This is attributed to the use of a rotation frame of coordinates in the human visual system. However, there is no evidence that ConvNets, a machine learning architecture, use a frame of coordinates for rotation. We investigated the effect of adding one to ConvNets. An explicit orientation encoding kernel was developed using a mathematically inspired self-supervised approach. The experimental results showed that rotation encoding improved both the accuracy of classifying rotated images and the resilience against noise.

Are There Any Body-movement Differences between Women and Men When They Laugh?

Smiling differences between men and women have been studied in psychology. Women smile more than men, although women are not universally more expressive across all facial actions. There are also body-movement differences between women and men. For example, more open body postures have been reported for men. But are there any body-movement differences between men and women when they laugh? To investigate this question, we study body-movement signals extracted from recorded laughter videos using a deep learning pose estimation model.

A landscape-based analysis of fixed temperature and simulated annealing

Since the introduction of Simulated Annealing (SA), researchers have considered variants that keep the same temperature value throughout the whole search and have tried to determine whether this strategy can be more effective than the original cooling scheme. Several studies have tried to answer this question, without reaching a conclusive answer and without providing indications that could be useful for a practical implementation.
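To make the two strategies concrete, here is a minimal sketch, not the authors' implementation. It assumes a standard Metropolis acceptance rule and a geometric cooling schedule; the function names (`anneal`, `fixed`, `geometric`) and the toy cost function are hypothetical and chosen only for illustration.

```python
import math
import random


def anneal(cost, neighbor, x0, temperature, steps, seed=0):
    """Metropolis search; `temperature(k)` returns the temperature at step k."""
    rng = random.Random(seed)
    x, fx = x0, cost(x0)
    best, fbest = x, fx
    for k in range(steps):
        t = max(temperature(k), 1e-12)  # guard against division by zero
        y = neighbor(x, rng)
        fy = cost(y)
        delta = fy - fx
        # Improvements are always accepted; worsening moves are accepted
        # with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
    return best, fbest


def fixed(t0):
    """Fixed-temperature variant: the same value at every step."""
    return lambda k: t0


def geometric(t0, alpha):
    """Assumed cooling scheme: temperature shrinks by a factor alpha per step."""
    return lambda k: t0 * alpha ** k


# Toy problem (hypothetical): minimize (x - 3)^2 over the integers.
def toy_cost(x):
    return (x - 3) ** 2


def toy_neighbor(x, rng):
    return x + rng.choice((-1, 1))


best_fixed, f_fixed = anneal(toy_cost, toy_neighbor, 20, fixed(1.0), 2000)
best_cool, f_cool = anneal(toy_cost, toy_neighbor, 20, geometric(10.0, 0.99), 2000)
```

The only difference between the two runs is the schedule passed as `temperature`, which is the comparison the abstract describes.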

Improve Convolutional Neural Network Pruning by Maximizing Filter Variety

Neural network pruning is a widely used strategy for reducing model storage and computing requirements. It lowers the complexity of the network by introducing sparsity in the weights. Because taking advantage of sparse matrices is still challenging, pruning is often performed in a structured way, i.e. by removing entire convolution filters in the case of ConvNets, according to a chosen pruning criterion.
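Structured pruning as described above can be sketched in a few lines. This example assumes an L1-norm criterion as a stand-in; it is not the filter-variety criterion this paper proposes. The names `l1_norm` and `prune_filters` are hypothetical, and filters are represented as plain nested lists rather than framework tensors.

```python
def l1_norm(filt):
    """Sum of absolute values over an arbitrarily nested list of numbers."""
    if isinstance(filt, (int, float)):
        return abs(filt)
    return sum(l1_norm(item) for item in filt)


def prune_filters(filters, keep_ratio):
    """Keep whole filters with the largest L1 norm (assumed criterion).

    Returns the kept filters and their original indices, in original order.
    """
    n_keep = max(1, round(len(filters) * keep_ratio))
    ranked = sorted(range(len(filters)),
                    key=lambda i: l1_norm(filters[i]),
                    reverse=True)
    keep = sorted(ranked[:n_keep])
    return [filters[i] for i in keep], keep


# Four tiny 1x2 "filters" (hypothetical values); keep half of them.
filters = [[[0.0, 0.0]], [[1.0, -1.0]], [[0.0, 0.1]], [[2.0, 2.0]]]
pruned, kept = prune_filters(filters, 0.5)
```

Because entire filters are removed, the result is a smaller dense layer and does not rely on sparse-matrix support.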

The role of diversity and ensemble learning in credit card fraud detection

The number of daily credit card transactions is inexorably growing: the expansion of the e-commerce market and the recent constraints imposed by the Covid-19 pandemic have significantly increased the use of electronic payments. The ability to precisely detect fraudulent transactions is increasingly important, and machine learning models are now a key component of the detection process. Standard machine learning techniques are widely employed, but they are inadequate for the evolving nature of customer behavior, which entails continuous changes in the underlying data distribution.

A Digital Twin Approach for Improving Estimation Accuracy in Dynamic Thermal Rating of Transmission Lines

The limited thermal capacity of transmission lines plays a crucial role in the safety and reliability of power systems. Dynamic thermal line rating approaches aim to estimate a transmission line's temperature and assess its compliance with these limits. Existing physics-based standards estimate the temperature from environmental and line conditions measured by several sensors. This manuscript shows that estimation accuracy can be improved by adopting a data-driven Digital Twin approach.
