Tutorials

Tutorial T1: A Practical Introduction to Data Wallets for Privacy-First Analytics and AI

Presenters: Anushka Vidanage, Graham Williams, Dawei Chen, Sergio J. Rodríguez Méndez

Authors: Anushka Vidanage, Graham Williams, Dawei Chen, Sergio J. Rodríguez Méndez

Affiliations: The Australian National University

Duration: 2 hours

Abstract:
Data remains the foundation for all that we do across knowledge discovery, data science, machine learning and AI. The centralized collection of data through cloud services and international companies has provided a boon for today’s AI advances. Yet this data belongs to the individuals to whom the data relates. Society in general is beginning to realize the dangers of the traditional approach of having individuals’ data centrally collected through social media, online shopping, and more. The collection of this increasingly complex and sensitive data has left it vulnerable to security and privacy breaches. Increasingly we are recognizing the need for a paradigm shift in how data is collected, secured, controlled, and stored. The concept of a personal data wallet or a personal online datastore (Pod) hosted on Solid (social linked data) servers is emerging as that paradigm shift. Storing personal data in a distributed manner on the cloud, with individuals holding access control (self-sovereignty) over their own data, is key as society continues to be impacted by AI developments. The open Solid specification enables the hosting and sharing of Pods on servers, under an individual’s control. The new paradigm sees an ecosystem of apps through which individuals collect, control, store, and manage their own data hosted within their data wallet. An individual has complete control over with whom, when, and how they share their data and even where their data is hosted. They can then choose to participate in the AI revolution to suit their own needs. In this tutorial, we will deliver: (1) a comprehensive discussion of the technology behind Solid Pods, (2) an explanation of how Solid Pods can provide fine-grained access control and consent mechanisms for data sharing, and (3) hands-on activities for storing and sharing personal data between Pods and developing apps on a Solid Pods architecture.

Tutorial T2: AI Ethics Assurance, AI Explainability, and AI Security: From Principles to Practice

Presenters: Jianlong Zhou, Zhiyu Zhu, Zhibo Jin

Authors: Jianlong Zhou, Zhiyu Zhu, Zhibo Jin

Affiliations: University of Technology Sydney

Duration: 2 hours

Abstract:
This tutorial aims to provide a comprehensive understanding of key concerns in AI technologies, focusing on AI ethics, AI explainability, and AI security. The focus on these three areas stems from the critical need to ensure that AI systems are trustworthy, transparent, and secure in real-world applications. AI Ethics Assurance is essential because operationalizing ethical principles in practical applications remains a challenge. This tutorial will explore various approaches to provide practical methods for assuring AI ethics. AI Explainability is crucial for deploying trustworthy AI systems, especially in high-stakes domains such as medical diagnosis and autonomous decision-making. The tutorial will introduce the Attribution-Based Explainability (ABE) framework, which addresses key limitations of existing solutions by integrating state-of-the-art algorithms and ensuring compliance with core attribution axioms. AI Security is vital due to the evolving nature of adversarial threats that pose significant vulnerabilities to AI models. The tutorial will cover systematic benchmarking of attack transferability and the development of robust defense mechanisms. It will introduce TAA-Bench, a benchmark framework for evaluating attack transferability across different deep learning models. Besides introducing state-of-the-art theories in AI ethics, AI explainability, and AI security, the tutorial will offer hands-on practice and demonstrations, providing participants with practical guides to apply these theories in their work. By the end of this tutorial, participants will have a deeper understanding of how to enhance the transparency, trustworthiness, and security of AI systems.

Tutorial T3: Practical Online Continual Learning

Presenters: Heitor Murilo Gomes, Anton Lee, Nuwan Gunasekara

Authors: Heitor Murilo Gomes, Anton Lee, Nuwan Gunasekara

Affiliations: Victoria University of Wellington, Halmstad University

Duration: 2 hours

Abstract:
Online Continual Learning (OCL) enables machine learning models to learn sequentially from non-stationary data while maintaining past knowledge. This tutorial presents a practical guide to OCL, covering key concepts, challenges such as catastrophic forgetting and stability-plasticity trade-offs, and recent advances including prototype-based methods and prompt-based approaches. The session includes hands-on demonstrations using CapyMOA, an open-source platform for online/stream/continual learning. By bridging research insights with practical implementation, this tutorial aims to equip attendees with the tools to develop robust OCL solutions.

Tutorial T4: AI and Data Mining in the Era of Trust and Security

Presenters: Guanfeng Liu, Xuyun Zhang, Zijian Ying

Authors: Guanfeng Liu, Xuyun Zhang, Zijian Ying

Affiliations: Macquarie University, Nanjing University of Science and Technology

Duration: 2 hours

Abstract:
In an era where artificial intelligence (AI) and data mining play a crucial role in decision-making, ensuring trust, privacy, and security in AI-driven data analytics has become paramount. This tutorial will provide an in-depth exploration of techniques and methodologies for developing secure and privacy-preserving AI models in data mining. We will cover essential topics such as federated learning, differential privacy, adversarial robustness, and trustworthy AI frameworks. Through case studies and real-world applications, participants will gain knowledge on how to develop secure and reliable AI-driven data mining models.

Tutorial T5: Developing Data-Driven Automated Negotiating Agents

Presenters: Yasser Mohammad

Authors: Yasser Mohammad

Affiliations: NEC CORPORATION, National Institute of Advanced Industrial Science and Technology

Duration: 2 hours

Abstract:
This tutorial introduces attendees to the problem of building effective negotiation strategies using machine learning and data-driven methods. After providing the motivation for this problem, the tutorial presents the needed theoretical background about Automated Negotiation (AN). We then present recent advances in applying machine learning and data-driven methods for building negotiation strategies, learning opponent models, learning bidding and acceptance strategies, and adapting to changes in utility functions. Based on this background, the tutorial then introduces a unifying framework that can be used to represent most existing research in ML and RL for automated negotiation as well as new solutions not yet attempted. The attendees will then learn through a live demonstration (with optional follow-along) how to use this framework to represent, solve, and evaluate the solution of a specific problem of applying reinforcement learning to automated negotiation.

Tutorial T6: Advanced Techniques for Automated Diagnosis of Cardiac Diseases

Presenters: Moomal Farhad, Mohammad Mehedy Masud

Authors: Moomal Farhad, Mohammad Mehedy Masud

Affiliations: United Arab Emirates University

Duration: 1.5 hours

Abstract:
Automating the diagnosis of cardiac diseases through advanced technologies is crucial to improving healthcare precision and efficiency. This tutorial introduces cutting-edge deep learning and data-efficient methods specifically designed for diagnosing cardiac diseases. It provides a comprehensive overview of the application of these techniques in the broader medical field, followed by a detailed exploration of their specific relevance to cardiac diseases. The tutorial addresses challenges such as limited data availability due to privacy regulations, diverse medical standards, and legal constraints. It explores techniques like contrastive learning, Siamese networks, and zero-shot or few-shot learning, showcasing their effectiveness in diagnosing diseases with limited datasets. Real-life examples are used to demonstrate how these methods enhance medical image classification and diagnosis. The tutorial also examines the involvement of medical professionals in the automated diagnostic process and discusses the current obstacles and future prospects for advancing medical AI in the treatment of cardiac conditions.

Tutorial T7: Computational Pathology Foundation Models: Datasets, Adaptation Strategies, and Evaluations

Presenters: Dong Li, Chen Zhao, Xintao Wu

Authors: Dong Li, Guihong Wan, Xintao Wu, Xinyu Wu, Ajit Nirmal, Christine Lian, Peter Sorger, Yevgenly Semenov, Chen Zhao

Affiliations: Baylor University, Harvard Medical School, University of Arkansas

Duration: 1.5 hours

Abstract:
Computational Pathology Foundation Models (CPathFMs) have emerged as a transformative approach for automating histopathological analysis by leveraging self-supervised learning on large-scale, unlabeled whole-slide images (WSIs). These models, categorized into uni-modal and multi-modal frameworks, facilitate tasks such as segmentation, classification, biomarker discovery, and prognosis prediction. However, the development of CPathFMs faces significant challenges, including limited dataset availability, domain-specific adaptation requirements, and the absence of standardized evaluation benchmarks. This tutorial will provide a comprehensive overview of the current state of CPathFMs, covering key datasets, adaptation strategies such as contrastive learning and multi-modal integration, and a taxonomy of evaluation tasks. We will discuss how these models are trained, fine-tuned, and assessed, addressing the critical gaps in generalization, bias mitigation, and clinical applicability. Additionally, we will explore emerging research directions in fairness, transparency, security, and standardization of evaluation protocols. This tutorial will serve as an essential resource for researchers, clinicians, and AI practitioners looking to advance the field of AI-driven computational pathology.

Tutorial T8: Uncertainty Quantification in Neural Networks and Uncertainty-Aware Decision Making in Supervised Learning

Presenters: Amir H. Gandomi, Hassan Gharoun

Authors: Amir H. Gandomi, Hassan Gharoun

Affiliations: University of Technology Sydney, Óbuda University

Duration: 1.5 hours

Abstract:
Understanding and quantifying uncertainty in neural networks is crucial for developing reliable and robust AI systems. In machine learning, uncertainty arises from various sources, including data noise (aleatoric uncertainty) and model limitations (epistemic uncertainty). Uncertainty quantification involves identifying and measuring these uncertainties to enhance the predictive confidence and decision-making capabilities of neural networks. A prevalent misunderstanding is that the probability values output by neural networks, typically normalized using the Softmax function, accurately measure model confidence. These values might appear as class probabilities but often do not reflect the model’s true certainty. This tutorial will provide an in-depth definition of uncertainty quantification, discuss its importance, and explore methods such as Bayesian neural networks, Monte Carlo dropout, and ensemble techniques. It will cover the theoretical foundations, practical implementations, and applications of these methods, highlighting their significance in improving model reliability and performance. Participants will learn how to incorporate uncertainty estimates into neural networks and utilize them for uncertainty-aware decision-making processes. By the end of the tutorial, attendees will gain a comprehensive understanding of current techniques and practical insights to implement uncertainty quantification in their neural network projects.

Tutorial T9: Knowledge Discovery from Unstructured Text Documents with GenAI

Presenters: Richi Nayak, Md Abul Bashar, Joy Wang, Duoyi Zhang

Authors: Richi Nayak, Md Abul Bashar, Joy Wang, Duoyi Zhang

Affiliations: Queensland University of Technology

Duration: 1.5 hours

Abstract:
In recent years, Generative Artificial Intelligence (GenAI) has emerged as a transformative force in the field of knowledge discovery. GenAI techniques, ranging from foundational frameworks such as Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) to advanced Large Language Models (LLMs), have revolutionized the way researchers and practitioners approach data analysis and interpretation. Unlike discriminative methods, which focus on generating decision boundaries for input data, GenAI aims to model the full distribution space of inputs, such as unstructured text data. This comprehensive capture of the input data space enables more robust and versatile performance across a wide range of knowledge discovery tasks. This tutorial offers a systematic exploration of the impact of GenAI on knowledge discovery for unstructured text documents. We begin by introducing foundational GenAI models, covering both theoretical frameworks and practical implementations of large language models. Subsequently, we delve into their applications and the challenges associated with their use in knowledge discovery tasks. The tutorial is structured into three main parts, each focusing on a specific knowledge discovery task. The first part addresses information extraction from open-ended mining reports provided by the Queensland Department of Resources (DoR). We discuss how GANs and pre-trained language models can be leveraged for semi-supervised information extraction, highlighting their advantages and limitations. The second part focuses on topic modeling using VAEs, exploring their applications in diverse domains such as COVID-19 trend analysis and cyber threat identification. Additionally, we introduce multimodal topic modeling, an emerging trend in the GenAI era. The final part examines knowledge summarization with GenAI, presenting a framework that integrates topic modeling and LLMs to achieve effective summarization.
Throughout the tutorial, we emphasize current research gaps and open questions in the field, providing insights into potential directions for future research. This tutorial aims to equip participants with a deeper understanding of GenAI’s capabilities and its transformative potential in knowledge discovery.

Tutorial T10: Heuristics and Meta-Heuristics for Social Networks Privacy Preserving: Concepts and Applications

Presenters: Navid Yazdanjue, Amir H. Gandomi

Authors: Navid Yazdanjue, Amir H. Gandomi

Affiliations: University of Technology Sydney, Óbuda University

Duration: 1.5 hours

Abstract:
The rapid expansion of online social networks (OSNs) has led to an unprecedented surge in user-generated data, often shared with third parties for research and analysis. However, this data contains sensitive attributes, such as identifiers, quasi-identifiers, and confidential information, posing significant privacy risks. Adversaries can exploit the structural properties of OSNs to re-identify users. A well-documented example is the Netflix Prize dataset, where de-identified user ratings were linked with external sources, revealing personal identities. Such privacy breaches highlight the risks of inadequate privacy preservation in OSNs and necessitate robust privacy-preserving techniques. Among these, anonymization has emerged as a widely adopted approach, transforming original social network data into obfuscated versions that protect user privacy. However, this process introduces a critical trade-off between privacy and utility: while enhanced privacy reduces the risk of user identification, it may also lead to significant information loss, thereby diminishing the data’s value for analysis. This tutorial explores the application of meta-heuristics and heuristics as effective solutions to address this challenge of anonymization in OSNs. Participants will gain a comprehensive understanding of how these advanced optimization techniques can balance the trade-off between privacy and utility by efficiently altering the network structure. Through an in-depth exploration of principles, practical methodologies, and conducted research studies, this tutorial provides a roadmap for developing privacy-preserving methods for online social networks. We will also discuss emerging trends and future research directions, offering insights into how these techniques can evolve to meet the growing demands of privacy preservation in the digital age.

PAKDD 2025
