{"id":43003,"date":"2025-07-13T09:32:40","date_gmt":"2025-07-13T13:32:40","guid":{"rendered":"https:\/\/www.bloomberg.com\/company\/stories\/bloomberg-ai-engineers-publish-3-information-retrieval-research-papers-sigir-2025\/"},"modified":"2025-07-13T22:28:26","modified_gmt":"2025-07-14T02:28:26","slug":"bloomberg-ai-engineers-publish-3-information-retrieval-research-papers-sigir-2025","status":"publish","type":"post","link":"https:\/\/www.bloomberg.com\/company\/stories\/bloomberg-ai-engineers-publish-3-information-retrieval-research-papers-sigir-2025\/","title":{"rendered":"Bloomberg&#8217;s AI Engineers Publish 3 Information Retrieval Research Papers at SIGIR 2025"},"content":{"rendered":"<div class='bbg-row bbg-bg--white  bbg-row--margin-top-none bbg-row--margin-bottom-none' data-anchor='row-69fbd109355f9'>\n  \n\t\n\t\n\t<div class=\"bbg-row--content\">\n\t\t\n\t\t\t<div class='bbg-column bbg-column--width-8 bbg-column--offset-2'>\n\t<div class='bb-wysiwyg'>\n    \n    <p><span style=\"font-weight: 400;\">During the <\/span><a href=\"https:\/\/sigir2025.dei.unipd.it\/\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">48th International ACM SIGIR Conference<\/span><\/a><span style=\"font-weight: 400;\"> (SIGIR 2025) in Padua, Italy this week (July 13-17, 2025), researchers from Bloomberg\u2019s AI Engineering Group are showcasing their expertise in information retrieval (IR) by publishing three papers at the conference.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">IR is the process of finding and accessing relevant information from a collection of data, such as documents, images or databases, in response to a user&#8217;s query. IR is critical for combating information overload and facilitating access to knowledge in a world full of data. 
AI is revolutionizing the field, and research presented at SIGIR will be applied toward making information retrieval systems more robust, effective, and efficient.<\/span><\/p>\n\n<\/div>\n<figure class=\"image-figure image-figure--has-small-image\" data-animation=\"\">\n    <img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"720\" src=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Papers-updated.png\" class=\"attachment-full size-full image-figure__image image-figure__image--primary\" alt=\"\" srcset=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Papers-updated.png 1280w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Papers-updated.png 300w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Papers-updated.png 1024w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Papers-updated.png 768w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Papers-updated.png 280w\" sizes=\"(max-width: 1280px) 100vw, 1280px\" \/><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"720\" src=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Papers-updated.png\" class=\"attachment-full size-full image-figure__image image-figure__image--small\" alt=\"\" 
srcset=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Papers-updated.png 1280w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Papers-updated.png 300w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Papers-updated.png 1024w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Papers-updated.png 768w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Papers-updated.png 280w\" sizes=\"(max-width: 1280px) 100vw, 1280px\" \/>\n    \n<\/figure>\n<div class='bb-wysiwyg'>\n    \n    <p><span style=\"font-weight: 400;\">In addition, Bloomberg AI Research Scientist Shuo Zhang is one of the organizers of the <\/span><a href=\"https:\/\/finir2025.github.io\/\"><span style=\"font-weight: 400;\">2nd Workshop on Financial Information Retrieval in the Era of Generative AI<\/span><\/a><span style=\"font-weight: 400;\"> (FinIR) on July 17, 2025. The workshop serves as a forum for researchers and practitioners to explore potential approaches and research directions to address the challenges of using generative AI in finance through the use of advanced IR techniques. 
Its goal is to deepen understanding, accelerate progress, and support the advancement of IR technology to enhance generative models to address financial challenges.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">We asked some of the authors to summarize their research and explain why the results were notable:<\/span><\/p>\n<hr \/>\n<h3 style=\"text-align: center;\"><strong><u>Monday, July 14, 2025<\/u><\/strong><\/h3>\n<p><em>Reproducibility: Domain-Specific, Multimodal, and Multilingual Retrieval<\/em><br \/>\n<em>10:30-12:30 CEST<\/em><\/p>\n<p><a href=\"https:\/\/dl.acm.org\/doi\/10.1145\/3726302.3730290\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\"><strong>Benchmark Granularity and Model Robustness for Image-Text Retrieval: A Reproducibility Study<\/strong><\/span><\/a><br \/>\n<span style=\"font-weight: 400;\">Mariya Hendriksen (University of Amsterdam), Shuo Zhang (Bloomberg), Ridho Reinanda (Bloomberg), Mohamed Yahya (Bloomberg), Edgar Meij (Bloomberg), Maarten de Rijke (University of Amsterdam)<\/span><\/p>\n\n<\/div>\n<figure class=\"image-figure image-figure--has-small-image\" data-animation=\"\">\n    <img loading=\"lazy\" decoding=\"async\" width=\"707\" height=\"948\" src=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Reproducibility-paper.png\" class=\"attachment-full size-full image-figure__image image-figure__image--primary\" alt=\"Click to read &quot;Benchmark Granularity and Model Robustness for Image-Text Retrieval,&quot; published July 13, 2025 at SIGIR 2025\" srcset=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Reproducibility-paper.png 707w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Reproducibility-paper.png 224w, 
https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Reproducibility-paper.png 142w\" sizes=\"(max-width: 707px) 100vw, 707px\" \/><img loading=\"lazy\" decoding=\"async\" width=\"707\" height=\"948\" src=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Reproducibility-paper.png\" class=\"attachment-full size-full image-figure__image image-figure__image--small\" alt=\"Click to read &quot;Benchmark Granularity and Model Robustness for Image-Text Retrieval,&quot; published July 13, 2025 at SIGIR 2025\" srcset=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Reproducibility-paper.png 707w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Reproducibility-paper.png 224w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-Reproducibility-paper.png 142w\" sizes=\"(max-width: 707px) 100vw, 707px\" \/>\n    \n<\/figure>\n<div class='bb-wysiwyg'>\n    \n    <p><strong>Please summarize your research. Why are your results notable?<\/strong><\/p>\n<p><strong>Shuo Zhang:<\/strong> In our study, we examined how well today\u2019s AI models can match images with text descriptions \u2014 a technology behind many modern search and recommendation systems. While these systems are widely used, most are only tested with short, simple captions that don\u2019t reflect how people actually search in real life. Our work looked at two often-overlooked factors: the level of detail in descriptions (caption \u201cgranularity\u201d) and how well these systems handle errors or unexpected input (\u201crobustness\u201d). 
See Figure 1 as an illustration.<\/p>\n<p>Granularity refers to how much detail is present in a caption. A simple, coarse caption might be \u201ca red rose is sitting next to a couple of mugs,\u201d while a fine-grained version could be \u201ca single red rose in a glass vase sits on a dining table, next to two ceramic mugs with floral designs.\u201d<\/p>\n<p>Robustness (illustrated below), on the other hand, shows how even a small typo (\u201ccouple\u201d vs. \u201ccoupel,\u201d or \u201cmotorcycles\u201d vs. \u201comtorcycles\u201d) can increase or decrease the quality of retrieved images. This highlights the importance of evaluating models under realistic, imperfect conditions, not just with idealized queries.<\/p>\n\n<\/div>\n<figure class=\"image-figure image-figure--has-small-image\" data-animation=\"\">\n    <img loading=\"lazy\" decoding=\"async\" width=\"1676\" height=\"1458\" src=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image2.png\" class=\"attachment-full size-full image-figure__image image-figure__image--primary\" alt=\"Examples of perturbation effects on R@1 (Recall@1) for image retrieval\" srcset=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image2.png 1676w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image2.png 300w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image2.png 1024w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image2.png 768w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image2.png 1536w, 
https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image2.png 218w\" sizes=\"(max-width: 1676px) 100vw, 1676px\" \/><img loading=\"lazy\" decoding=\"async\" width=\"1676\" height=\"1458\" src=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image2.png\" class=\"attachment-full size-full image-figure__image image-figure__image--small\" alt=\"Examples of perturbation effects on R@1 (Recall@1) for image retrieval\" srcset=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image2.png 1676w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image2.png 300w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image2.png 1024w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image2.png 768w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image2.png 1536w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image2.png 218w\" sizes=\"(max-width: 1676px) 100vw, 1676px\" \/>\n    <figcaption class='image-figure__caption'><span style=\"font-weight: 400\">Examples of perturbation effects on R@1 (Recall@1) for image retrieval<\/span><\/figcaption>\n<\/figure>\n<div class='bb-wysiwyg'>\n    \n    <p><span style=\"font-weight: 400;\">We evaluated four of the most advanced vision-language models, <\/span><a href=\"https:\/\/huggingface.co\/docs\/transformers\/model_doc\/clip\" target=\"_blank\" 
rel=\"noopener\"><b>CLIP<\/b><\/a><b>, <\/b><a href=\"https:\/\/huggingface.co\/docs\/transformers\/en\/model_doc\/align\" target=\"_blank\" rel=\"noopener\"><b>ALIGN<\/b><\/a><b>, <\/b><a href=\"https:\/\/huggingface.co\/docs\/transformers\/en\/model_doc\/altclip\" target=\"_blank\" rel=\"noopener\"><b>AltCLIP<\/b><\/a><b>, <\/b><span style=\"font-weight: 400;\">and <\/span><a href=\"https:\/\/huggingface.co\/docs\/transformers\/en\/model_doc\/groupvit\" target=\"_blank\" rel=\"noopener\"><b>GroupViT<\/b><\/a><span style=\"font-weight: 400;\">, on both standard datasets and new versions we created, which include richer, more descriptive captions. To test robustness, we introduced a wide range of common mistakes \u2014 like word order changes, typos, and synonyms \u2014 into the search queries, and measured how each model performed.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Key Findings<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\"><strong>Detailed descriptions lead to better results:<\/strong> Models found the correct images much more often when given detailed, specific captions \u2014 up to 16% better in some cases.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\"><strong>Varying robustness:<\/strong> Models responded differently to different kinds of input \u201cnoise.\u201d Some errors, like rearranged word order, had a major impact; others, like small word changes, were less disruptive or even occasionally improved performance.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\"><strong>Order matters:<\/strong> Contrary to previous beliefs, these AI models are sensitive to the order of words in a description, especially with longer or more detailed captions.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\"><strong>A new evaluation toolkit:<\/strong> We 
built a new testing framework that allows others to measure model performance using more realistic, challenging scenarios that reflect actual user behavior.<\/span><\/li>\n<\/ul>\n<p><b>How does your research advance the state-of-the-art in the field of information retrieval?<\/b><\/p>\n<p><span style=\"font-weight: 400;\">Most current image search systems are tested only under ideal conditions, with simple, short, and general captions \u2014 for example, \u201ca person walking a dog.\u201d These \u201ccoarse\u201d captions mention only the main objects or actions and leave out specifics like the dog\u2019s breed, what the person is wearing, or the location. In contrast, real users often type much more detailed and sometimes imperfect queries, such as, \u201ca woman in a yellow raincoat walking a black labrador on a leash through a city park in autumn, with leaves on the ground.\u201d This mismatch means existing AI systems may not be as reliable as we think when deployed in real-world situations. 
Our research directly addresses this gap.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>More accurate search:<\/b><span style=\"font-weight: 400;\"> For industries that rely on finding the right image quickly, such as newsrooms, digital asset management, and e-commerce, improved search accuracy saves time and reduces frustration.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Greater reliability:<\/b><span style=\"font-weight: 400;\"> Our findings help ensure AI tools don\u2019t break down when faced with everyday user input, improving user trust and satisfaction.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Higher standards:<\/b><span style=\"font-weight: 400;\"> By introducing more realistic testing, we set a new bar for how image-text AI should be evaluated before being put into production.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">For example, if someone needs to find a specific photo for a story and enters a long, detailed caption \u2014 even with a typo: \u201ca woman in a yellow raincaot walking a black labrador&#8230;\u201d \u2014 our research helps ensure that the right image can still be found. This is crucial in fast-paced or high-stakes environments. 
To support progress across the industry, we are <\/span><a href=\"https:\/\/github.com\/bloomberg\/evaluating-cmr-in-mm\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">making our tools publicly available<\/span><\/a><span style=\"font-weight: 400;\">, so others can continue improving the reliability and usefulness of AI search technology.<\/span><\/p>\n\n<\/div>\n<div class=\"bb-separator\" data-color=\"\">\n\t<hr class=\"bb-separator__rule\">\n<\/div>\n<div class='bb-wysiwyg'>\n    \n    <h3 style=\"text-align: center;\"><strong><u>Tuesday, July 15, 2025<\/u><\/strong><\/h3>\n<p><em>Short Paper Posters 2<\/em><br \/>\n<em>14:00-15:30 CEST<\/em><br \/>\n<a href=\"https:\/\/dl.acm.org\/doi\/10.1145\/3726302.3730163\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\"><strong>An Alternative to FLOPS Regularization to Effectively Productionize SPLADE-Doc<\/strong><\/span><\/a><br \/>\n<span style=\"font-weight: 400;\">Aldo Porco* (Bloomberg), Dhruv Mehra* (Bloomberg), Igor Malioutov* (Bloomberg), Karthik Radhakrishnan* (Bloomberg), Moniba Keymanesh* (Bloomberg), Daniel Preo\u0163iuc-Pietro (Bloomberg), Sean MacAvaney (University of Glasgow), Pengxiang Cheng (Bloomberg).<\/span><br \/>\n<em>(* equal contributions)<\/em><\/p>\n<p><em>This paper will also be presented on Thursday, July 17, 2025 at the <a href=\"https:\/\/reneuir.org\/\" target=\"_blank\" rel=\"noopener\">Workshop on Reaching Efficiency in Neural Information Retrieval<\/a> (ReNeuIR&#8217;25).<\/em><\/p>\n\n<\/div>\n<figure class=\"image-figure image-figure--has-small-image\" data-animation=\"\">\n    <img loading=\"lazy\" decoding=\"async\" width=\"714\" height=\"931\" src=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-DF-FLOPS-paper.png\" class=\"attachment-full size-full image-figure__image image-figure__image--primary\" alt=\"Click to read &quot;An Alternative to 
FLOPS Regularization to Effectively Productionize SPLADE-Doc,&quot; published July 13, 2025 at SIGIR 2025\" srcset=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-DF-FLOPS-paper.png 714w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-DF-FLOPS-paper.png 230w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-DF-FLOPS-paper.png 146w\" sizes=\"(max-width: 714px) 100vw, 714px\" \/><img loading=\"lazy\" decoding=\"async\" width=\"714\" height=\"931\" src=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-DF-FLOPS-paper.png\" class=\"attachment-full size-full image-figure__image image-figure__image--small\" alt=\"Click to read &quot;An Alternative to FLOPS Regularization to Effectively Productionize SPLADE-Doc,&quot; published July 13, 2025 at SIGIR 2025\" srcset=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-DF-FLOPS-paper.png 714w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-DF-FLOPS-paper.png 230w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/SIGIR-2025-DF-FLOPS-paper.png 146w\" sizes=\"(max-width: 714px) 100vw, 714px\" \/>\n    \n<\/figure>\n<div class='bb-wysiwyg'>\n    \n    <p><strong>Please summarize your research. 
Why are your results notable?<\/strong><\/p>\n<p><strong>Karthik Radhakrishnan:<\/strong> Our research deals with improving the latency of retrieval models from the SPLADE-Doc family. Typically, these models are used to generate representations for passages indexed into search engines, such as Apache Solr. A big part of their retrieval latency is due to high frequency tokens that are assigned high weights in passage representations. As a result, for queries containing these high frequency tokens, a large number of documents are matched from the index, resulting in slow retrieval. The current training process for these models cannot handle this issue without a substantial drop in performance.<\/p>\n<p>We propose a new method of regularization called \u201cDF-FLOPS.\u201d The method enforces sparsity over both documents and terms, unlike the popular FLOPS regularization, which only does so within documents.<\/p>\n<p>Let\u2019s consider the representations produced with both FLOPS and DF-FLOPS for a sample document below. When using FLOPS, the expansions generated by the model often contain high frequency tokens that are semantically irrelevant (tokens like \u201cis,\u201d \u201cwith,\u201d etc.). 
These high frequency tokens highlighted in red below result in large search indices and slow retrieval.<\/p>\n\n<\/div>\n<figure class=\"image-figure image-figure--has-small-image\" data-animation=\"\">\n    <img loading=\"lazy\" decoding=\"async\" width=\"1764\" height=\"689\" src=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image1.png\" class=\"attachment-full size-full image-figure__image image-figure__image--primary\" alt=\"Representation with FLOPS\" srcset=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image1.png 1764w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image1.png 300w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image1.png 1024w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image1.png 768w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image1.png 1536w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image1.png 280w\" sizes=\"(max-width: 1764px) 100vw, 1764px\" \/><img loading=\"lazy\" decoding=\"async\" width=\"1764\" height=\"689\" src=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image1.png\" class=\"attachment-full size-full image-figure__image image-figure__image--small\" alt=\"Representation with FLOPS\" 
srcset=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image1.png 1764w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image1.png 300w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image1.png 1024w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image1.png 768w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image1.png 1536w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image1.png 280w\" sizes=\"(max-width: 1764px) 100vw, 1764px\" \/>\n    \n<\/figure>\n<div class='bb-wysiwyg'>\n    \n    <p>With DF-FLOPS, the model learns to avoid relying on high frequency tokens while using them when appropriate. 
For example, in the text below, the word \u201cwho\u201d is used to represent the \u201cWorld Health Organization,\u201d and keeping the token would be crucial for retrieval, whereas irrelevant tokens like \u201cis\u201d and \u201cwith\u201d are dropped.<\/p>\n\n<\/div>\n<figure class=\"image-figure image-figure--has-small-image\" data-animation=\"\">\n    <img loading=\"lazy\" decoding=\"async\" width=\"1755\" height=\"749\" src=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image3.png\" class=\"attachment-full size-full image-figure__image image-figure__image--primary\" alt=\"Representation with DF-FLOPS\" srcset=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image3.png 1755w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image3.png 300w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image3.png 1024w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image3.png 768w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image3.png 1536w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image3.png 280w\" sizes=\"(max-width: 1755px) 100vw, 1755px\" \/><img loading=\"lazy\" decoding=\"async\" width=\"1755\" height=\"749\" src=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image3.png\" class=\"attachment-full size-full image-figure__image image-figure__image--small\" 
alt=\"Representation with DF-FLOPS\" srcset=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image3.png 1755w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image3.png 300w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image3.png 1024w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image3.png 768w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image3.png 1536w, https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/image3.png 280w\" sizes=\"(max-width: 1755px) 100vw, 1755px\" \/>\n    \n<\/figure>\n<div class='bb-wysiwyg'>\n    \n    <p>With our method, we can lower the retrieval latency of SPLADE-Doc by 10x, to the level of BM25, making learned sparse retrieval practical for deployment in production-grade search engines.<\/p>\n<p>The performance of the retrieval system decreases slightly when tested in the same domain as its training data (-2.2 MRR@10 on MS-MARCO), but it improves on 12\/13 cross-domain datasets from the BEIR benchmark.<\/p>\n<p><strong>How does your research advance the state-of-the-art in the field of information retrieval?<\/strong><\/p>\n<p>Semantic similarity applications are widespread and include search engines and retrieval-augmented generation (RAG)-based chatbots. These rely on representing documents as numerical vectors (\u201cembeddings\u201d) from which document similarity is computed. 
Learned sparse embeddings, such as those obtained from SPLADE, are a more practical alternative to dense embeddings, as they enable faster computation of similarities and can make use of existing sparse index solutions and infrastructure.<\/p>\n<p>The method described in our paper shows that the retrieval latency of SPLADE can be sped up by an order of magnitude, while preserving most of the quality improvements. These practical improvements are critical for real production retrieval systems.<\/p>\n\n<\/div>\n<div class=\"bb-separator\" data-color=\"\">\n\t<hr class=\"bb-separator__rule\">\n<\/div>\n<div class='bb-wysiwyg'>\n    \n    <h3 style=\"text-align: center;\"><strong><u>Thursday, July 17, 2025<\/u><\/strong><\/h3>\n<p><strong><a href=\"https:\/\/finir2025.github.io\/\" target=\"_blank\" rel=\"noopener\">FinIR: The 2nd Workshop on Financial Information Retrieval in the Era of Generative AI<\/a><br \/>\n<\/strong><span style=\"font-weight: 400;\">Fengbin Zhu (National University of Singapore), Yunshan Ma (Singapore Management University), Fuli Feng (University of Science and Technology of China), Chao Wang (6Estates Pte Ltd.), Huanbo Luan (6Estates Pte Ltd), Guangnan Ye (Fudan University), Shuo Zhang (Bloomberg), Dhagash Mehta (BlackRock), Pingping Chen (Goldman Sachs), Bing Xiang (Goldman Sachs), Tat-Seng Chua (National University of Singapore)<\/span><\/p>\n\n<\/div>\n<div class='bb-wysiwyg'>\n    \n    <p><b>Please explain the goal of this workshop. Why are you helping to organize it?<\/b><\/p>\n<p><b>Shuo Zhang<\/b><span style=\"font-weight: 400;\">: The 2nd Workshop on Financial Information Retrieval in the Era of Generative AI is designed to explore and address the emerging challenges at the intersection of generative AI and financial information retrieval. 
As generative models \u2014 particularly large language models (LLMs) \u2014 continue to revolutionize information access, their limitations become increasingly evident in fast-paced domains like finance. Issues like hallucination, outdated knowledge, and data sparsity make it clear that generative models cannot operate in isolation.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This is where information retrieval (IR) plays a critical role. The workshop aims to advance the integration of IR technologies \u2014 such as retrieval-augmented generation (RAG), multimodal and real-time retrieval, and domain-specific query understanding \u2014 into generative systems for finance. It also tackles core concerns around benchmarking, privacy, trustworthiness, and evaluation frameworks unique to financial applications.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">I am helping to organize this workshop because I believe the financial domain offers a uniquely demanding and richly complex environment for IR research. The combination of heterogeneous data, temporal sensitivity, and regulatory constraints requires sophisticated solutions that push the boundaries of both retrieval and generation. With my research focus on IR and text analytics, I\u2019m especially excited to help bridge the research gap and promote cross-pollination between academia and industry practitioners.<\/span><\/p>\n<p><b>How do you expect or hope that this workshop will help advance the state-of-the-art in the field of financial information retrieval?<\/b><\/p>\n<p><span style=\"font-weight: 400;\">We hope this workshop will drive new progress in how IR is used in finance, especially when combined with generative AI models like LLMs. Financial data is complex, constantly changing, and comes in many forms \u2014 like tables, charts, reports, and news. 
Traditional generative models often struggle with this because they rely only on their training data, and can&#8217;t always access the latest or most specific information.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This is where IR becomes essential. The workshop encourages new methods to improve how we find and use relevant financial information \u2014 for example, retrieving updated data in real time, handling multiple types of data (like text and images), and refining user queries in financial contexts. These advances can make generative models more accurate, up-to-date, and useful in real-world financial applications.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">We&#8217;re also focused on creating better ways to evaluate these systems, since current benchmarks don\u2019t reflect the real demands of financial tasks. The workshop supports building new benchmarks and testing tools that help researchers measure system performance more realistically.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Another goal is to support the design of complete systems that combine retrieval and generation, such as financial assistants or tools for analysts. Finally, we want to tackle trust and privacy issues, which are especially important in finance. 
This includes improving data security, avoiding misinformation, and ensuring systems are explainable and reliable.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">By bringing together researchers and professionals from both academia and industry, we hope this workshop sparks ideas, collaborations, and new research directions that raise the standard for how AI is applied in finance.<\/span><\/p>\n\n<\/div>\n<div class=\"bb-separator\" data-color=\"\">\n\t<hr class=\"bb-separator__rule\">\n<\/div>\n<div class='bb-wysiwyg'>\n    \n    <p><a href=\"https:\/\/arxiv.org\/abs\/2507.07906\" target=\"_blank\" rel=\"noopener\"><strong>Agentic Retrieval of Topics and Insights from Earnings Calls<\/strong><\/a><br \/>\nAnant Gupta (Bloomberg), Rajarshi Bhowmik (Bloomberg), Geoffrey Gunow (Bloomberg).<\/p>\n\n<\/div>\n<figure class=\"image-figure image-figure--has-small-image\" data-animation=\"\">\n    <img loading=\"lazy\" decoding=\"async\" width=\"1700\" height=\"2200\" src=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/2507.07906v1_Page_01.png\" class=\"attachment-full size-full image-figure__image image-figure__image--primary\" alt=\"Click to read &quot;Agentic Retrieval of Topics and Insights from Earnings Calls,&quot; published July 17, 2025 at the 2nd Workshop on Financial Information Retrieval in the Era of Generative AI (FinIR 2025)\" \/><img loading=\"lazy\" decoding=\"async\" width=\"1700\" height=\"2200\" src=\"https:\/\/assets.bbhub.io\/image\/v1\/resize?width=auto&amp;type=webp&amp;url=https:\/\/assets.bbhub.io\/company\/sites\/51\/2025\/07\/2507.07906v1_Page_01.png\" class=\"attachment-full size-full image-figure__image image-figure__image--small\" alt=\"Click to read &quot;Agentic Retrieval of Topics and Insights from Earnings Calls,&quot; published July 17, 2025 at the 2nd Workshop on Financial Information Retrieval in the Era of Generative AI (FinIR 2025)\" \/>\n    \n<\/figure>\n<div class='bb-wysiwyg'>\n    \n    <p><b>Please summarize your research. Why are your results notable?<\/b><\/p>\n<p><b>Anant Gupta: <\/b><span style=\"font-weight: 400;\">In this work that we&#8217;re presenting during <a href=\"https:\/\/finir2025.github.io\/\" target=\"_blank\" rel=\"noopener\">the 2nd Workshop on Financial Information Retrieval in the Era of Generative AI<\/a> (FinIR 2025), we showcase a generative AI-driven, agentic framework for dynamically extracting and organizing financial topics from corporate earnings call transcripts. Our system utilizes LLMs to autonomously identify, structure, and evolve a hierarchical topic ontology that captures emerging themes and their relationships over time. 
Unlike traditional topic models such as Latent Dirichlet Allocation (LDA), which rely on static, unsupervised distributions or pre-defined labels, our approach enables fine-grained, contextual tracking of financially-relevant narratives across companies and sectors.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Our results are notable in three ways:<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Dynamic topic discovery<\/b><span style=\"font-weight: 400;\">: The agentic system can surface emerging financial topics without requiring prior labeling, capturing evolving language in real time (e.g., the rise of &#8220;generative AI&#8221; or &#8220;cost reduction&#8221; post-2022).<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Ontology-grounded insights<\/b><span style=\"font-weight: 400;\">: We construct and validate a coherent multi-level topic ontology using semantic similarity and embedding-based coherence metrics, showing improved structural integrity compared to LDA baselines.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Actionable financial analytics<\/b><span style=\"font-weight: 400;\">: We demonstrate downstream applications like trend detection, competitor benchmarking, and identification of strategic differentiators, providing equity analysts with early, interpretable signals directly from unstructured text.<\/span><\/li>\n<\/ol>\n<p><b>How does your research advance the state-of-the-art in the field of information retrieval?<\/b><\/p>\n<p><span style=\"font-weight: 400;\">Our work advances IR by shifting from static, keyword-based topic detection toward dynamic, context-aware retrieval using agentic LLMs. 
While existing IR systems often rely on pre-indexed vocabularies or supervised classifiers, our approach uses <\/span><i><span style=\"font-weight: 400;\">semantic reasoning<\/span><\/i><span style=\"font-weight: 400;\"> to identify novel concepts as they appear, and <\/span><i><span style=\"font-weight: 400;\">self-updating structures<\/span><\/i><span style=\"font-weight: 400;\"> (ontologies) to organize them over time.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This agentic system acts not just as a retriever but as a <\/span><i><span style=\"font-weight: 400;\">semantic curator <\/span><\/i><span style=\"font-weight: 400;\">\u2014 it validates, contextualizes, and links topics across documents in a way that mirrors how a human analyst might build mental models over quarters of financial reporting. In doing so, we bridge the gap between deep retrieval and interpretability, enabling richer question-answering, longitudinal trend analysis, and sector-wide benchmarking.<\/span><\/p>\n\n<\/div>\n\n<\/div>\n\n\n\t\t\n\t<\/div>\n<\/div>\n\n","protected":false}}