Shengyao (Arvin) Zhuang 庄胜尧
I am a postdoctoral researcher at CSIRO, Australian e-Health Research Centre, where I focus on developing large language model-based search engine systems in the medical domain. Before joining CSIRO, I was a Ph.D. student at the ielab, EECS, The University of Queensland, Australia, supervised by Professor Guido Zuccon. My primary research interests lie in information retrieval, large language model-based neural rankers, and NLP in general.
Publications
2024
Large Language Models for Stemming: Promises, Pitfalls and Failures
Shuai Wang, Shengyao Zhuang, and Guido Zuccon.
Published in Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’24), 2024. Short paper
Leveraging LLMs for Unsupervised Dense Retriever Ranking
Ekaterina Khramtsova, Shengyao Zhuang (equal contribution), Mahsa Baktashmotlagh, and Guido Zuccon (Best Paper Honorable Mention Award)
Published in Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’24), 2024. Full paper
Understanding and Mitigating the Threat of Vec2Text to Dense Retrieval Systems
Shengyao Zhuang, Bevan Koopman, Xiaoran Chu, and Guido Zuccon.
Published in Proceedings of the 2nd International ACM SIGIR Conference on Information Retrieval in the Asia Pacific (SIGIR-AP ’24), 2024. Full paper
Starbucks: Improved Training for 2D Matryoshka Embeddings
Shengyao Zhuang, Shuai Wang, Bevan Koopman, and Guido Zuccon.
Published in arXiv, 2024. Full paper
PromptReps: Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval
Shengyao Zhuang, Xueguang Ma, Bevan Koopman, Jimmy Lin, and Guido Zuccon.
Published in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. Full paper
2023
Teaching Pre-Trained Language Models to Rank Effectively, Efficiently, and Robustly
Shengyao Zhuang.
Published in The University of Queensland, 2023. Thesis
Typos-aware Bottlenecked Pre-Training for Robust Dense Retrieval
Shengyao Zhuang, Linjun Shou, Jian Pei, Ming Gong, Houxing Ren, Guido Zuccon and Daxin Jiang.
Published in Proceedings of the 1st International ACM SIGIR Conference on Information Retrieval in the Asia Pacific (SIGIR-AP ’23), 2023. Full paper
A Setwise Approach for Effective and Highly Efficient Zero-shot Ranking with Large Language Models
Shengyao Zhuang, Honglei Zhuang, Bevan Koopman and Guido Zuccon.
Published in Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’24), 2023. Full paper
Selecting which Dense Retriever to use for Zero-Shot Search
Ekaterina Khramtsova, Shengyao Zhuang, Mahsa Baktashmotlagh, Xi Wang and Guido Zuccon.
Published in Proceedings of the 1st International ACM SIGIR Conference on Information Retrieval in the Asia Pacific (SIGIR-AP ’23), 2023. Full paper
Augmenting Passage Representations with Query Generation for Enhanced Cross-Lingual Dense Retrieval
Shengyao Zhuang, Linjun Shou, Guido Zuccon
Published in Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’23), 2023. Short paper
Exploring the Representation Power of SPLADE Models
Joel Mackenzie, Shengyao Zhuang (equal contribution), Guido Zuccon
Published in Proceedings of the 2023 ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR ’23), 2023. Short paper
Beyond CO2 Emissions: The Overlooked Impact of Water Consumption of Information Retrieval Models
Guido Zuccon, Harrisen Scells, Shengyao Zhuang
Published in Proceedings of the 2023 ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR ’23), 2023. Short paper
Open-source Large Language Models are Strong Zero-shot Query Likelihood Models for Document Ranking
Shengyao Zhuang, Bing Liu, Bevan Koopman and Guido Zuccon.
Published in Findings of the Association for Computational Linguistics: EMNLP 2023, 2023. Short paper
Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation
Shengyao Zhuang, Houxing Ren, Linjun Shou, Jian Pei, Ming Gong, Guido Zuccon and Daxin Jiang.
Published in The First Workshop on Generative Information Retrieval at SIGIR 2023, 2023. Full paper
2022
To Interpolate or not to Interpolate: PRF, Dense and Sparse Retrievers
Hang Li, Shuai Wang, Shengyao Zhuang, Ahmed Mourad, Xueguang Ma, Jimmy Lin, Guido Zuccon
Published in Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’22), 2022. Short paper
Reduce, Reuse, Recycle: Green Information Retrieval Research
Harry Scells, Shengyao Zhuang, Guido Zuccon (Best Paper Honorable Mention Award)
Published in Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’22), 2022. Perspective paper
Implicit Feedback for Dense Passage Retrieval: A Counterfactual Approach
Shengyao Zhuang, Hang Li and Guido Zuccon
Published in Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’22), 2022. Full paper
CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos
Shengyao Zhuang, Guido Zuccon
Published in Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’22), 2022. Full paper
Asyncval: A Toolkit for Asynchronously Validating Dense Retriever Checkpoints during Training
Shengyao Zhuang, Guido Zuccon
Published in Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’22), 2022. Demo paper
Reinforcement Online Learning to Rank with Unbiased Reward Shaping
Shengyao Zhuang, Zhihao Qiao, Guido Zuccon
Published in Information Retrieval Journal (IRJ), 2022. Journal paper
Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback: A Reproducibility Study
Hang Li, Shengyao Zhuang, Ahmed Mourad, Xueguang Ma, Jimmy Lin, Guido Zuccon
Published in Proceedings of the 44th European Conference on Information Retrieval (ECIR ’22), 2022. Reproducibility paper
Pseudo-Relevance Feedback with Dense Retrievers in Pyserini
Hang Li, Shengyao Zhuang, Xueguang Ma, Jimmy Lin, Guido Zuccon
Published in Proceedings of the 26th Australasian Document Computing Symposium (ADCS ’22), 2022. Demo paper
Robustness of Neural Rankers to Typos: A Comparative Study
Shengyao Zhuang, Xinyu Mao, Guido Zuccon (Best Paper Award)
Published in Proceedings of the 26th Australasian Document Computing Symposium (ADCS ’22), 2022. Short paper
Pseudo Relevance Feedback with Deep Language Models and Dense Retrievers: Successes and Pitfalls
Hang Li, Ahmed Mourad, Shengyao Zhuang, Bevan Koopman, Guido Zuccon
Published in Transactions on Information Systems (TOIS), 2022. Journal paper
2021
Fast Passage Re-ranking with Contextualized Exact Term Matching and Efficient Passage Expansion
Shengyao Zhuang, Guido Zuccon
Published in arXiv preprint, 2021. Full paper
TILDE: Term Independent Likelihood moDEl for Passage Re-ranking
Shengyao Zhuang, Guido Zuccon
Published in Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’21), 2021. Full paper
How do Online Learning to Rank Methods Adapt to Changes of Intent?
Shengyao Zhuang, Guido Zuccon
Published in Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’21), 2021. Full paper
BERT-based Dense Retrievers Require Interpolation with BM25 for Effective Passage Retrieval
Shuai Wang, Shengyao Zhuang, Guido Zuccon
Published in Proceedings of the 2021 ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR ’21), 2021. Full paper
Effective and Privacy-preserving Federated Online Learning to Rank
Shuyi Wang, Bing Liu, Shengyao Zhuang, Guido Zuccon (Best Student Paper Award)
Published in Proceedings of the 2021 ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR ’21), 2021. Full paper
Dealing with Typos for BERT-based Passage Retrieval and Ranking
Shengyao Zhuang, Guido Zuccon
Published in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021. Short paper
Federated Online Learning to Rank with Evolution Strategies: A Reproducibility Study
Shuyi Wang, Shengyao Zhuang, Guido Zuccon
Published in Advances in Information Retrieval - 43rd European Conference on IR Research, ECIR 2021, 2021. Reproducibility paper
Deep Query Likelihood Model for Information Retrieval
Shengyao Zhuang, Hang Li, Guido Zuccon
Published in Advances in Information Retrieval - 43rd European Conference on IR Research, ECIR 2021, 2021. Short paper
2020
Counterfactual Online Learning to Rank
Shengyao Zhuang, Guido Zuccon
Published in Advances in Information Retrieval - 42nd European Conference on IR Research, ECIR 2020, 2020. Full paper