Hi, greetings from Zeyu. I am midway through my Ph.D. studies, supervised by Prof. Sebastian Schelter, Dr. Iacer Calixto, and Prof. Paul Groth. My research sits at the intersection of machine learning and relational data management. Right now, I’m especially interested in building foundation models that can make better sense of structured data. I’m also exploring how to make these models both efficient and usable, so they scale well to complex real-world systems.
I am open to any kind of connection and collaboration.
Zeyu Zhang, Paul Groth, Iacer Calixto, Sebastian Schelter
International Conference on Extending Database Technology (EDBT) 2025
We propose a new challenge that integrates two practical constraints into conventional entity matching (EM) tasks to better align with real-world deployment scenarios. A comprehensive evaluation of eight matching methods across 11 datasets provides key insights into model selection and data profiling.
Zeyu Zhang, Paul Groth, Iacer Calixto, Sebastian Schelter
GoodData Workshop, AAAI 2025
We introduce AnyMatch, a novel framework for building effective and efficient entity matching systems. AnyMatch leverages small language models and borrows ideas from instruction tuning to diversify the training corpus, refining the model through multiple data selection strategies. The GPT-2 variant of AnyMatch ranks second among baseline models, achieving an F1 score only 4.4% lower than GPT-4 in a zero-shot setting, while reducing costs by a factor of 3,899.