Hi, my name is

Lukas Laskowski.

I am a PhD student @ HPI.

PhD Student at the Hasso Plattner Institute, advised by Prof. Felix Naumann. My research focuses on data cleaning, AI, and computer vision applications for wildlife conservation.

Experience

Doctoral Researcher - HPI
Nov 2023 - present
My research, sponsored by SAP SE, focuses on developing ontology learning systems that transform structured data into meaningful knowledge representations. I’m particularly interested in the intersection of data management, AI, and semantic technologies.
Visiting Researcher - University of California, Irvine
Feb 2025 - May 2025
I’m currently conducting a research visit at UC Irvine, working with Prof. Padhraic Smyth. Together, we’re developing a human-in-the-loop ontology learning system that leverages a sequence-to-sequence (seq2seq) architecture. The goal is to integrate expert feedback into the learning loop for more accurate and interpretable knowledge extraction.
AI Engineer and Technical Advisor - Wildlife Conservation
Feb 2024 - present

I conduct research on wildlife re-identification AI technologies for a species conservation startup based in East-Central Africa. The work focuses on applying computer vision to support the monitoring and protection of endangered animals in the wild.

  • Published a paper on the use of AI for wildlife conservation.
Software Engineer (Working Student) - Siemens Energy
2018 - 2023

I designed and developed the central automated edge provisioning solution for a global Industrial IoT platform (“Connected Factory”), actively contributing to the core implementation.

  • Implementation of the new global Industrial IoT platform that connects industrial plants to the cloud (“Connected Factory”)
  • Responsible for the system architecture and development of the central automated edge provisioning solution
  • Technical management of external project staff in Europe and India
  • Planning the introduction of the cloud platform in over 80 factories
Cloud Operations Engineer (Intern) - SAP SE
Aug 2021 - Sep 2021
I administered services on the SAP Cloud Platform and developed automated deployment pipelines to streamline delivery processes. Additionally, I worked on the development and optimization of ABAP components to enhance system performance and reliability.

Education

2021 - 2023
M.Sc. Data Engineering
Hasso Plattner Institute, Potsdam
GPA: 3.9 out of 4.0

Extracurricular Activities

  • Co-Founder of the HPI Entrepreneurship Club
  • Thesis: NumbER – Entity Resolution on Numerical Data
2018 - 2021
B.Sc. IT-Systems Engineering
Hasso Plattner Institute, Potsdam
GPA: 3.5 out of 4.0

Bachelor thesis published at a top-tier database conference.

Selected Publications

Frost: A Platform for Benchmarking and Exploring Data Matching Results
VLDB 2022 Benchmarking Data Matching
We introduce Frost, the first platform to systematically evaluate and explore data deduplication solutions by combining quality, cost, and effort metrics with interactive result exploration—addressing the limitations of existing benchmarks that focus solely on matching accuracy.
Prisma: A Privacy-Preserving Schema Matcher using Functional Dependencies
EDBT 2025 Schema Matching Functional Dependencies
Prisma introduces a novel approach to schema matching that leverages functional dependencies and graph embeddings to identify column correspondences without relying on name or data similarity. Prisma outperforms existing methods, particularly in multi-table databases with significantly different or encrypted encodings.
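For readers unfamiliar with functional dependencies, the following minimal sketch shows what it means for a dependency to hold in a table. It is not Prisma's algorithm: the helper `holds_fd` and the toy rows are hypothetical and only illustrate the concept the matcher builds on. A dependency such as zip -> city is a structural property of the data, so it can be checked without comparing column names or value encodings across tables.

```python
def holds_fd(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts (one per tuple); lhs and rhs are lists of
    column names. The dependency holds when equal lhs values always
    come with equal rhs values.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # setdefault stores the first rhs value seen for this lhs value;
        # any later, different rhs value violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Toy data, made up for illustration only.
rows = [
    {"zip": "14482", "city": "Potsdam", "name": "A"},
    {"zip": "14482", "city": "Potsdam", "name": "B"},
    {"zip": "10115", "city": "Berlin", "name": "A"},
]

print(holds_fd(rows, ["zip"], ["city"]))   # zip determines city here
print(holds_fd(rows, ["name"], ["city"]))  # name does not determine city
```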
GorillaVision – Open-Set Re-Identification of Wild Gorillas
Workshop Camera Traps, AI and Ecology Computer Vision Wildlife Conservation
This work introduces an open-set re-identification system for gorillas in the wild, combining YOLOv7 face detection with a Vision Transformer-based embedding model trained using Triplet Loss. By classifying embeddings with a k-nearest neighbors algorithm, the system enables reliable identification of unseen individuals, achieving 60–90% accuracy depending on dataset quality and population size.
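The classification step can be sketched as follows. This is a minimal illustration, not the GorillaVision implementation: it assumes embeddings are plain tuples of floats, and the rejection rule (returning "unknown" when the nearest neighbor is farther than `max_distance`) is an assumption about how an open-set decision could be made. The names `knn_open_set`, `max_distance`, and the gallery labels are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_open_set(query, gallery, k=3, max_distance=0.5):
    """Classify a query embedding against a labeled gallery.

    gallery is a list of (embedding, label) pairs. The result is the
    majority label among the k nearest neighbors, or "unknown" when
    even the nearest neighbor is farther away than max_distance
    (an assumed open-set rejection rule).
    """
    ranked = sorted((euclidean(query, emb), label) for emb, label in gallery)
    nearest = ranked[:k]
    if not nearest or nearest[0][0] > max_distance:
        return "unknown"
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 2-D embeddings, made up for illustration only.
gallery = [
    ((0.0, 0.0), "gorilla_a"),
    ((0.1, 0.0), "gorilla_a"),
    ((1.0, 1.0), "gorilla_b"),
    ((1.1, 1.0), "gorilla_b"),
]

print(knn_open_set((0.05, 0.02), gallery))  # close to the gorilla_a cluster
print(knn_open_set((5.0, 5.0), gallery))    # far from every known individual
```

In practice the threshold would have to be tuned on held-out data, since a suitable value depends on the embedding model and the distance metric.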

Highlights

Research Visit at UC Irvine Feb - May 2025
I'm excited to be conducting a research visit at UC Irvine, working with Prof. Padhraic Smyth on human-in-the-loop ontology learning. Our work combines methods from databases, the Semantic Web, and AI to create more interactive and adaptive knowledge systems.
Invited Talk at MIT
Had the opportunity to present and discuss my research on entity resolution on numerical data, as well as my upcoming work on ontology engineering.
Paper Talk at VLDB 2022 in Sydney
Presented our research at the 48th International Conference on Very Large Data Bases (VLDB 2022) in Sydney. The talk focused on our paper 'Frost: A Platform for Benchmarking and Exploring Data Matching Results'.
Talk at AWS in Zurich 2022
Presented our new Industrial IoT solution from Siemens Energy at an AWS event in Zurich, alongside Michael Brunklaus and Mario Pilz. The talk focused on connecting industrial labs and factories to the cloud.
Vice President of VfB Hermsdorf - Tennis, 2022 - present
Elected as Vice President of VfB Hermsdorf - Tennis, a local tennis club in Berlin.
Talk at SAP Signavio VAD Knowledge Club
Presented my ongoing PhD research on building a system that automatically integrates enterprise data into a unified knowledge graph. The talk highlighted how this approach enables high-quality AI applications by bridging structured data and semantic technologies.