Accepted Papers List

DM5680
IMPsys: An Intelligent Mold Processing System for Smart Factory
Xueyi Zhou, Yohan Na, Minju Bang, Dong-Kyu Chae
The explosive popularity of smart manufacturing has drawn researchers’ attention to intelligent mold processing and management. Machining mold components is a crucial step in the mold production process for many industries: operations such as cutting, drilling, and shaping metal create the individual parts (e.g., core pins, ejector pins, cavities, slides, and lifters) that make up a mold used in manufacturing. We present IMPsys, an AI-based system that, given numerous 3D modelling files of mold components, automatically explores machining jobs, infers their processing time, and schedules them on machines. Our demo video can be found at: http://bit.ly/3EeKnyL.
List of keywords
Humans and AI -> HAI: Intelligent user interfaces
Computer Vision -> CV: 3D computer vision
Humans and AI -> HAI: Human-AI collaboration
Humans and AI -> HAI: Human-computer interaction
DM5681
Matting Moments: A Unified Data-Driven Matting Engine for Mobile AIGC in Photo Gallery
Yanhao Zhang, Fanyi Wang, Weixuan Sun, Jingwen Su, Peng Liu, Yaqian Li, Xinjie Feng, Zhengxia Zou
Image matting is a fundamental technique in visual understanding and has become one of the most significant capabilities in mobile phones. Despite the development of mobile storage and computing power, achieving diverse mobile Artificial Intelligence Generated Content (AIGC) applications remains a great challenge. To address this issue, we present an innovative demonstration of an automatic system called "Matting Moments" that enables automatic image editing based on matting models in different scenarios. Coupled with accurate and refined matting subjects, our system provides visual element editing abilities and backend services for distribution and recommendation that respond to emotional expressions. Our system comprises three components: 1) photo content structuring, 2) data-driven matting engine, and 3) AIGC functions for generation, which automatically achieve diverse photo beautification in the gallery. This system offers a unified framework that guides consumers to obtain intelligent recommendations with beautifully generated contents, helping them enjoy the moments and memories of their present life.
List of keywords
Computer Vision -> CV: Applications
Computer Vision -> CV: Other
Computer Vision -> CV: Segmentation
DM5686
VideoMaster: A Multimodal Micro Game Video Recreator
Yipeng Yu, Xiao Chen, Hui Zhan
To free humans from laborious video production, this paper proposes VideoMaster, a multimodal system equipped with four capabilities: highlight extraction, video description, video dubbing, and video editing. It extracts interesting episodes from long game videos, generates subtitles for each episode, reads the subtitles aloud with synthesized speech, and finally re-creates a better short video through video editing. Notably, VideoMaster combines deep learning and traditional computer vision techniques to extract highlights with fine-to-coarse labels, utilizes a novel framework named PCSG-v (probabilistic context sensitive grammar for video) for video description generation, and imitates a target speaker’s voice to read the description. To the best of our knowledge, VideoMaster is the first multimedia system that can automatically produce product-level micro-videos without heavy human annotation.
List of keywords
Multidisciplinary Topics and Applications -> MDA: Arts and creativity
Computer Vision -> CV: Applications
Computer Vision -> CV: Segmentation
Computer Vision -> CV: Video analysis and understanding
Computer Vision -> CV: Vision and language
Machine Learning -> ML: Multi-modal learning
Natural Language Processing -> NLP: Applications
Natural Language Processing -> NLP: Language generation
Natural Language Processing -> NLP: Speech
DM5691
SupervisorBot: NLP-Annotated Real-Time Recommendations of Psychotherapy Treatment Strategies with Deep Reinforcement Learning
Baihan Lin, Guillermo Cecchi, Djallel Bouneffouf
We present a novel recommendation system designed to provide real-time treatment strategies to therapists during psychotherapy sessions. Our system utilizes a turn-level rating mechanism that forecasts the therapeutic outcome by calculating a similarity score between the profound representation of a scoring inventory and the patient’s current spoken sentence. By transcribing and segmenting the continuous audio stream into patient and therapist turns, our system conducts immediate evaluation of their therapeutic working alliance. The resulting dialogue pairs, along with their computed working alliance ratings, are then utilized in a deep reinforcement learning recommendation system. In this system, the sessions are treated as users, while the topics are treated as items. To showcase the system’s effectiveness, we not only evaluate its performance using an existing dataset of psychotherapy sessions but also demonstrate its practicality through a web app. Through this demo, we aim to provide a tangible and engaging experience of our recommendation system in action.
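The turn-level rating described in this abstract can be pictured with a minimal sketch. It assumes that each inventory item and each patient turn has already been embedded as a fixed-length vector and that the similarity measure is cosine similarity; the abstract specifies neither, and the function names and toy vectors below are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def score_turn(turn_embedding, inventory_embeddings):
    """Return one similarity score per inventory item for a single dialogue turn."""
    return [cosine_similarity(turn_embedding, item) for item in inventory_embeddings]

# Hypothetical 3-dimensional embeddings; a real system would use a sentence encoder.
inventory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
patient_turn = [2.0, 0.0, 0.0]
scores = score_turn(patient_turn, inventory)
print(scores)
```

In such a setup, the per-turn scores would serve as the ratings attached to each dialogue pair before it is passed to the recommendation model.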
List of keywords
Humans and AI -> HAI: Computational sustainability and human wellbeing
Data Mining -> DM: Recommender systems
Humans and AI -> HAI: Applications
Humans and AI -> HAI: Brain sciences
Humans and AI -> HAI: Human-computer interaction
Humans and AI -> HAI: Intelligent user interfaces
Multidisciplinary Topics and Applications -> MDA: Health and medicine
Natural Language Processing -> NLP: Applications
Natural Language Processing -> NLP: Dialogue and interactive systems
DM5695
Latent Inspector: An Interactive Tool for Probing Neural Network Behaviors Through Arbitrary Latent Activation
Daniel Geißler, Bo Zhou, Paul Lukowicz
This work presents an active software instrument that allows deep learning architects to interactively inspect a neural network model’s output behavior in response to user-manipulated values in any latent layer. Latent Inspector offers multiple dimension reduction techniques to visualize the model’s high-dimensional latent layer output in human-perceptible, two-dimensional plots. The system is implemented with a Node.js front end for interactive user input and a Python back end for interacting with the model. By utilizing a general and modular architecture, our proposed solution dynamically adapts to a versatile range of models and data structures. Compared to existing tools, our asynchronous approach of separating the training process from the inspection offers additional possibilities, such as interactive data generation, by actively working with the model instead of visualizing training logs. Overall, Latent Inspector demonstrates both the possibilities and the emerging limits of a generalized, tool-based concept for enhancing model insight in terms of explainable and transparent AI.
List of keywords
Machine Learning -> ML: Explainable/Interpretable machine learning
AI Ethics, Trust, Fairness -> ETF: Trustworthy AI
Data Mining -> DM: Data visualization
Data Mining -> DM: Exploratory data mining
Humans and AI -> HAI: Human-computer interaction
Machine Learning -> ML: Feature extraction, selection and dimensionality reduction
DM5696
Automated Planning for Generating and Simulating Traffic Signal Strategies
Saumya Bhatnagar, Rongge Guo, Keith McCabe, Thomas McCluskey, Francesco Percassi, Mauro Vallati
There is a growing interest in the use of AI techniques for urban traffic control, with a particular focus on traffic signal optimisation. Model-based approaches such as planning have been shown to be capable of dealing in real time with unexpected or unusual traffic conditions, as well as with the usual traffic patterns. Further, the knowledge models on which such techniques rely to generate traffic signal strategies are in fact simulation models of traffic, and hence can be used by traffic authorities to test and compare different approaches. In this work, we present a framework that relies on automated planning to generate and simulate traffic signal strategies in an urban region. To demonstrate the capabilities of the framework, we consider real-world data collected from sensors deployed in a major corridor of the Kirklees region of the United Kingdom.
List of keywords
Planning and Scheduling -> PS: Applications
Planning and Scheduling -> PS: Mixed discrete/continuous planning
DM5703
Fedstellar: A Platform for Training Models in a Privacy-preserving and Decentralized Fashion
Enrique Tomás Martínez Beltrán, Pedro Miguel Sánchez Sánchez, Sergio López Bernal, Gérôme Bovet, Manuel Gil Pérez, Gregorio Martínez Pérez, Alberto Huertas Celdrán
This paper presents Fedstellar, a platform for training decentralized Federated Learning (FL) models in heterogeneous topologies in terms of the number of federation participants and their connections. Fedstellar allows users to build custom topologies, enabling them to control the aggregation of model parameters in a decentralized manner. The platform offers a Web application for creating, managing, and connecting nodes to ensure data privacy and provides tools to measure, monitor, and analyze the performance of the nodes. The paper describes the functionalities of Fedstellar and its potential applications. To demonstrate the applicability of the platform, different use cases are presented in which decentralized, semi-decentralized, and centralized architectures are compared in terms of model performance, convergence time, and network overhead when collaboratively classifying hand-written digits using the MNIST dataset.
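As a rough illustration of aggregating model parameters over a custom topology, the sketch below runs one synchronous round in which every node averages its own parameter vector with those of its direct neighbours. It is not Fedstellar's API: the function name, the dictionary-based topology, and the plain unweighted mean are all assumptions made for the example.

```python
def aggregate_round(params, topology):
    """One synchronous round: each node averages its parameters with its neighbours'.

    params: dict mapping node name to a list of floats (hypothetical flat model).
    topology: dict mapping node name to the list of its neighbour names.
    """
    updated = {}
    for node, own in params.items():
        group = [own] + [params[neighbour] for neighbour in topology[node]]
        # Element-wise mean across the node and its neighbours.
        updated[node] = [sum(values) / len(group) for values in zip(*group)]
    return updated

# Hypothetical line topology a - b - c with two parameters per node.
topology = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
params = {"a": [0.0, 3.0], "b": [3.0, 3.0], "c": [6.0, 3.0]}
after_one = aggregate_round(params, topology)
print(after_one)
```

Changing only the `topology` dictionary (for example, to a fully connected graph or a star) changes which nodes exchange parameters while the averaging code stays the same.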
List of keywords
Machine Learning -> ML: Federated learning
Machine Learning -> ML: Applications
Machine Learning -> ML: Classification
Machine Learning -> ML: Evaluation
DM5705
SemFORMS: Automatic Generation of Semantic Transforms By Mining Data Science Code
Ibrahim Abdelaziz, Julian Dolby, Udayan Khurana, Horst Samulowitz, Kavitha Srinivas
Careful choice of feature transformations in a dataset can help predictive model performance, data understanding and data exploration. However, finding useful features is a challenge, and while recent Automated Machine Learning (AutoML) systems provide some limited automation for feature engineering or data exploration, it is still mostly done by humans. We demonstrate a system called SemFORMS (Semantic Transforms), which attempts to mine useful expressions for a dataset from a repository of code that may target the same or a similar dataset. In many enterprises, numerous data scientists often work on the same or similar datasets, but are largely unaware of each other’s work. SemFORMS finds appropriate code from such a repository, and normalizes the code into an actionable transform that can be prepended to any AutoML pipeline. We demonstrate SemFORMS operating over example datasets from the OpenML benchmarks, where it sometimes leads to significant improvements in AutoML performance.
List of keywords
Machine Learning -> ML: Automated machine learning
DM5712
LingGe: An Automatic Ancient Chinese Poem-to-Song Generation System
Yong Shan, Jinchao Zhang, Huiying Ren, Yao Qiu, Jie Zhou
This paper presents a novel system, named LingGe ("伶歌" in Chinese), to generate songs for ancient Chinese poems automatically. LingGe takes the poem as the lyric, composes music conditioned on the lyric, and finally outputs a full song including the singing and the accompaniment. It consists of four modules: rhythm recognition, melody generation, accompaniment generation, and audio synthesis. Firstly, the rhythm recognition module analyzes the song structure and rhythm according to the poem. Secondly, the melody generation module assembles the rhythm into the template and then generates the melody. Thirdly, the accompaniment generation module predicts the accompaniment in harmony with the melody. Finally, the audio synthesis module generates singing and accompaniment audio and then mixes them to obtain songs. The results show that LingGe can generate high-quality and expressive songs for ancient Chinese poems, both in harmony and rhythm.
List of keywords
Multidisciplinary Topics and Applications -> MDA: Arts and creativity
Multidisciplinary Topics and Applications -> MDA: Other
DM5718
Modeling the Impact of Policy Interventions for Sustainable Development
Sowmith Nandan Rachuri, Arpitha Malavalli, Niharika Sri Parasa, Pooja Bassin, Srinath Srinivasa
There is an increasing demand to design policy interventions to achieve various targets specified by the UN Sustainable Development Goals by 2030. Designing interventions is a complex task given that the system may often respond in unexpected ways to a given intervention. This could be due to interventions towards a given target affecting other unrelated variables, and/or interventions leading to acute disparities in nearby geographic areas. In order to address such issues, we propose a novel concept called Stress Modeling that analyzes the holistic impact of a policy intervention by taking into account the interactions within a system after the intervention. The simulation is based on the postulate that complex systems of interacting entities tend to settle down into "low energy" configurations by minimizing differentials in capabilities of neighbouring entities. The simulation shows how policy impact percolates through geospatial boundaries over time and can be applied at any granularity. The theory and the corresponding package are explained along with a case study analyzing a fertilizer policy in the Agro-climatic Zones of the state of Karnataka, India.
List of keywords
Multidisciplinary Topics and Applications -> MDA: Computational sustainability
AI Ethics, Trust, Fairness -> ETF: AI and law, governance, regulation
Machine Learning -> ML: Explainable/Interpretable machine learning
Multidisciplinary Topics and Applications -> MDA: Energy, environment and sustainability
Uncertainty in AI -> UAI: Bayesian networks
DM5719
Optimized Crystallographic Graph Generation for Material Science
Astrid Klipfel, Yaël Frégier, Adlane Sayede, Zied Bouraoui
Graph neural networks are widely used in machine learning applied to chemistry, and in particular for material science discovery. For crystalline materials, however, generating graph-based representations from geometrical information for neural networks is not a trivial task. The periodicity of crystalline materials requires efficient implementations so that structures can be processed in real time in a massively parallel environment. With the aim of training graph-based generative models for new material discovery, we propose an efficient tool to generate cutoff graphs and k-nearest-neighbours graphs of periodic structures with GPU optimization. We provide pyMatGraph, a PyTorch-compatible framework to generate graphs in real time during the training of a neural network architecture. Our tool can update a graph of a structure, making generative models able to update the geometry and process the updated graph during the forward propagation on the GPU side. Our code is publicly available at https://github.com/aklipf/mat-graph.
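To make the notion of a cutoff graph of a periodic structure concrete, here is a small CPU-only sketch. It is not pyMatGraph's interface or its GPU algorithm: the function name is hypothetical, the cell is assumed to be orthorhombic, and fractional coordinates are assumed to lie in [0, 1). Under those assumptions, searching the 27 neighbouring cell images is enough as long as the cutoff does not exceed the shortest cell length.

```python
import math
from itertools import product

def periodic_cutoff_graph(frac_coords, cell_lengths, cutoff):
    """Edges (i, j, image, distance) of a cutoff graph for an orthorhombic cell.

    frac_coords: list of (x, y, z) fractional coordinates, each in [0, 1).
    cell_lengths: (a, b, c) edge lengths of the cell.
    cutoff: maximum bond length; must not exceed the shortest cell length.
    """
    if cutoff > min(cell_lengths):
        raise ValueError("cutoff exceeds shortest cell length; more images are needed")
    edges = []
    images = list(product((-1, 0, 1), repeat=3))  # the 27 neighbouring cell images
    for i, pi in enumerate(frac_coords):
        for j, pj in enumerate(frac_coords):
            for image in images:
                if i == j and image == (0, 0, 0):
                    continue  # skip an atom paired with itself in the same cell
                delta = [(pj[k] + image[k] - pi[k]) * cell_lengths[k] for k in range(3)]
                dist = math.sqrt(sum(d * d for d in delta))
                if dist <= cutoff:
                    edges.append((i, j, image, dist))
    return edges

# Simple cubic cell with one atom: each atom bonds to its six periodic images.
cubic_edges = periodic_cutoff_graph([(0.0, 0.0, 0.0)], (3.0, 3.0, 3.0), 3.0)
print(len(cubic_edges))
```

Because each edge carries the cell image it crosses, the same atom index can appear several times as a neighbour, which is the main difference from a cutoff graph of a finite molecule.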
List of keywords
Multidisciplinary Topics and Applications -> MDA: Energy, environment and sustainability
Multidisciplinary Topics and Applications -> MDA: Life sciences
Multidisciplinary Topics and Applications -> MDA: Physical sciences
Machine Learning -> ML: Feature extraction, selection and dimensionality reduction
DM5722
mahaNLP: A Marathi Natural Language Processing Library
Vidula Magdum, Omkar Dhekane, Sharayu Hiwarkhedkar, Saloni Mittal, Raviraj Joshi
We present mahaNLP, an open-source natural language processing (NLP) library specifically built for the Marathi language. It aims to enhance the support for the low-resource Indian language Marathi in the field of NLP. It is an easy-to-use, extensible and modular toolkit for Marathi text analysis built on state-of-the-art transformer models. In comparison to other existing Indic NLP libraries that support basic Marathi processing, this toolkit houses an extensive set of NLP tasks ranging from basic preprocessing tasks to advanced NLP tasks. Additionally, it provides functionality to load datasets for supervised tasks like Marathi sentiment analysis, NER, and hate speech detection as data frames. This paper focuses on the overview of the mahaNLP framework, its features, and its usage. This work is a part of the L3Cube MahaNLP initiative; more information about it can be found at https://github.com/l3cube-pune/MarathiNLP. The demonstration video and file of mahaNLP are available at https://youtu.be/KxExcwCrTO0 and https://cutt.ly/f1FYQak respectively.
List of keywords
Natural Language Processing -> NLP: Tools
Natural Language Processing -> NLP: Applications
Natural Language Processing -> NLP: Language models
Natural Language Processing -> NLP: Sentiment analysis, stylistic analysis, and argument mining
Natural Language Processing -> NLP: Named entities
Natural Language Processing -> NLP: Text classification
Natural Language Processing -> NLP: Information retrieval and text mining
DM5728
SiWare: Contextual Understanding of Industrial Data for Situational Awareness
Anuradha Bhamidipaty, Elham Khabiri, Bhavna Agrawal, Yingjie Li
SiWare is an AI-powered Knowledge Discovery system that helps unlock new insights and accelerates data-driven decisions with contextualized industrial data. SiWare links and fuses heterogeneous data sources with an industry semantic model, leveraging multiple AI capabilities to provide system-wide visibility into operational characteristics. As part of this demo paper, we describe the requirements for such a system and its deployment aspects, and demonstrate the benefits in two industrial scenarios.
List of keywords
Data Mining -> DM: Mining heterogenous data
Data Mining -> DM: Knowledge graphs and knowledge base completion
Natural Language Processing -> NLP: Applications
Knowledge Representation and Reasoning -> KRR: Applications
DM5729
NeoMaPy: A Framework for Computing MAP Inference on Temporal Knowledge Graphs
Victor David, Raphael Fournier-S’niehotta, Nicolas Travers
Markov Logic Networks (MLN) are used for reasoning on uncertain and inconsistent temporal data. We proposed the TMLN (Temporal Markov Logic Network), which extends them with sorts/types, weights on rules and facts, and various temporal consistencies. The NeoMaPy framework integrates it as a knowledge graph based on conflict graphs, which offers flexibility for reasoning with parametric Maximum A Posteriori (MAP) inferences, efficiency with an optimistic heuristic, and interactive graph visualization for results explanation.
List of keywords
Knowledge Representation and Reasoning -> KRR: Reasoning about knowledge and belief
Knowledge Representation and Reasoning -> KRR: Applications
Multidisciplinary Topics and Applications -> MDA: Databases
Planning and Scheduling -> PS: Markov decisions processes
DM5731
Understanding the Night-Sky? Developing AI-Enabled System for Exploring Night-Light Usage Patterns
Jakob Hederich, Shreya Ghosh, Zeyu He, Prasenjit Mitra
We present a demonstration of a nighttime light (NTL) pattern analysis system. Our tool, named NightVIEW, is powered by an efficient system architecture for easily exporting and analysing a huge volume of spatial NTL data, and by image segmentation and clustering algorithms that find unusual NTL patterns, identify hotspots of excess night light usage, and uncover the semantics of cities.
List of keywords
Data Mining -> DM: Data visualization
Data Mining -> DM: Applications
Data Mining -> DM: Mining spatial and/or temporal data
DM5732
Humming2Music: Being A Composer As Long As You Can Humming
Yao Qiu, Jinchao Zhang, Huiying Ren, Yong Shan, Jie Zhou
Creating a piece of music is difficult for people who have never been trained to compose. We present an automatic music generation system to lower the threshold of creating music. The system takes the user’s humming as input and creates full music based on the humming melody. The system consists of five modules: 1) humming transcription, 2) melody generation, 3) broken chord generation, 4) accompaniment generation, and 5) audio synthesis. The first module transcribes the user’s humming audio to a score, and then the melody generation module composes a complete melody based on the user’s humming melody. After that, the third module generates a broken chord track to accompany the full melody, and the fourth module creates more accompanying tracks. Finally, the audio synthesis module mixes all the tracks to generate the music. A user experiment shows that our system can generate high-quality music with natural expression based on the user’s humming input.
List of keywords
Multidisciplinary Topics and Applications -> MDA: Arts and creativity
Multidisciplinary Topics and Applications -> MDA: Other
DM5735
Bias On Demand: Investigating Bias with a Synthetic Data Generator
Joachim Baumann, Alessandro Castelnovo, Andrea Cosentini, Riccardo Crupi, Nicole Inverardi, Daniele Regoli
Machine Learning (ML) systems are increasingly being adopted to make decisions that might have a significant impact on people’s lives. Because these decision-making systems rely on data-driven learning, the risk is that they will systematically propagate the bias embedded in the data. To prevent harmful consequences, it is essential to comprehend how and where bias is introduced and possibly how to mitigate it. We demonstrate Bias on Demand, a framework to generate synthetic datasets with different types of bias, which is available as an open-source toolkit and as a pip package. We include a demo of our proposed synthetic data generator, in which we illustrate experiments on different scenarios to showcase the interconnection between biases and their effect on performance and fairness evaluations. We encourage readers to explore the full paper for a more detailed analysis.
List of keywords
AI Ethics, Trust, Fairness -> ETF: Bias
AI Ethics, Trust, Fairness -> ETF: Trustworthy AI
DM5739
Practical Model Reductions for Verification of Multi-Agent Systems
Wojciech Jamroga, Yan Kim
Formal verification of intelligent agents is often computationally infeasible due to state-space explosion. We present a tool for reducing the impact of the explosion by means of state abstraction that is (a) easy to use and understand by non-experts, and (b) agent-based in the sense that it operates on a modular representation of the system, rather than on its huge explicit state model.
List of keywords
Agent-based and Multi-agent Systems -> MAS: Engineering methods, platforms, languages and tools
Agent-based and Multi-agent Systems -> MAS: Applications
Agent-based and Multi-agent Systems -> MAS: Formal verification, validation and synthesis
DM5740
A Human-in-the-Loop Tool for Annotating Passive Acoustic Monitoring Datasets
Hannes Kath, Thiago S. Gouvêa, Daniel Sonntag
Deep learning methods are well suited for data analysis in several domains, but application is often limited by technical entry barriers and the availability of large annotated datasets. We present an interactive machine learning tool for annotating passive acoustic monitoring datasets created for wildlife monitoring, which are time-consuming and costly to annotate manually. The tool, designed as a web application, consists of an interactive user interface implementing a human-in-the-loop workflow. Class label annotations provided manually as bounding boxes drawn over a spectrogram are consumed by a deep generative model (DGM) that learns a low-dimensional representation of the input data, as well as the available class labels. The learned low-dimensional representation is displayed as an interactive interface element, where new bounding boxes can be efficiently generated by the user with lasso-selection; alternatively, the DGM can propose new, automatically generated bounding boxes on demand. The user can accept, edit, or reject annotations suggested by the model, thus owning final judgement. Generated annotations can be used to fine-tune the underlying model, thus closing the loop. Investigations of the prediction accuracy and first empirical experiments show promising results on an artificial data set, laying the ground for application to a real life scenario.
List of keywords
Machine Learning -> ML: Feature extraction, selection and dimensionality reduction
Multidisciplinary Topics and Applications -> MDA: Energy, environment and sustainability
Machine Learning -> ML: Autoencoders
Machine Learning -> ML: Incremental learning
Machine Learning -> ML: Representation learning
Machine Learning -> ML: Classification
Machine Learning -> ML: Active learning
DM5741
AutoML for Outlier Detection with Optimal Transport Distances
Prabhant Singh, Joaquin Vanschoren
Automated machine learning (AutoML) has been widely researched and adopted for supervised problems, but progress in unsupervised settings has been limited. We propose "LOTUS", a novel framework to automate outlier detection based on meta-learning. Our premise is that the selection of the optimal outlier detection technique depends on the inherent properties of the data distribution. We leverage optimal transport to find the dataset with the most similar underlying distribution, and then apply the outlier detection techniques that proved to work best for that data distribution. We evaluate the robustness of our framework and find that it outperforms all state-of-the-art automated outlier detection tools. This approach can also be easily generalized to automate other unsupervised settings.
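The idea of picking a detector by finding the most similar dataset under an optimal transport distance can be sketched in a heavily simplified form. The example assumes one-dimensional samples of equal size, for which the Wasserstein-1 distance reduces to the mean absolute difference between sorted values; LOTUS itself is not described at this level in the abstract, and the dataset names, detector labels, and function names below are hypothetical placeholders.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1D empirical samples."""
    if len(xs) != len(ys):
        raise ValueError("this simplified form needs samples of equal size")
    # For equal-size 1D samples the optimal coupling matches sorted values in order.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def most_similar_dataset(new_sample, meta_datasets):
    """Name of the stored dataset whose sample is closest to new_sample."""
    return min(
        meta_datasets,
        key=lambda name: wasserstein_1d(new_sample, meta_datasets[name]["sample"]),
    )

# Hypothetical meta-data: a sample per dataset plus the detector that worked best on it.
meta_datasets = {
    "sensor_logs": {"sample": [0.0, 1.0, 2.0, 3.0], "best_detector": "detector_A"},
    "transactions": {"sample": [10.0, 11.0, 12.0, 13.0], "best_detector": "detector_B"},
}
new_sample = [3.5, 0.5, 2.5, 1.5]
nearest = most_similar_dataset(new_sample, meta_datasets)
recommended = meta_datasets[nearest]["best_detector"]
print(nearest, recommended)
```

Real tabular datasets are multi-dimensional and differ in size, so a practical version would need a general optimal transport solver rather than the sorted-values shortcut used here.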
List of keywords
Data Mining -> DM: Anomaly/outlier detection
Machine Learning -> ML: Automated machine learning
DM5742
Plansformer Tool: Demonstrating Generation of Symbolic Plans Using Transformers
Vishal Pallagani, Bharath Muppasani, Biplav Srivastava, Francesca Rossi, Lior Horesh, Keerthiram Murugesan, Andrea Loreggia, Francesco Fabiano, Rony Joseph, Yathin Kethepalli
Plansformer is a novel tool that utilizes a fine-tuned language model based on transformer architecture to generate symbolic plans. Transformers are a type of neural network architecture that have been shown to be highly effective in a range of natural language processing tasks. Unlike traditional planning systems that use heuristic-based search strategies, Plansformer is fine-tuned on specific classical planning domains to generate high-quality plans that are both fluent and feasible. Plansformer takes the domain and problem files as input (in PDDL) and outputs a sequence of actions that can be executed to solve the problem. We demonstrate the effectiveness of Plansformer on a variety of benchmark problems and provide both qualitative and quantitative results obtained during our evaluation, including its limitations. Plansformer has the potential to significantly improve the efficiency and effectiveness of planning in various domains, from logistics and scheduling to natural language processing and human-computer interaction. In addition, we provide public access to Plansformer via a website as well as an API endpoint; this enables other researchers to utilize our tool for planning and execution. The demo video is available at https://youtu.be/_1rlctCGsrk
List of keywords
Planning and Scheduling -> PS: Learning in planning and scheduling
Natural Language Processing -> NLP: Language generation