Recommendation

Towards a more accurate metabarcoding approach for studying fungal communities of fermented foods

Caroline Strub based on reviews by Johannes Schweichhart and 2 anonymous reviewers

A recommendation of:

Comparison of metabarcoding taxonomic markers to describe fungal communities in fermented foods

Olivier Rué, Monika Coton, Eric Dugat-Bony, Kate Howell, Françoise Irlinger, Jean-Luc Legras, Valentin Loux, Elisa Michel, Jérôme Mounier, Cécile Neuvéglise, Delphine Sicard (2023), bioRxiv, ver.3, peer-reviewed and recommended by PCI Microbiology https://doi.org/10.1101/2023.01.13.523754

Read preprint in preprint server Now published in Peer Community Journal

Abstract

Comparison of metabarcoding taxonomic markers to describe fungal communities in fermented foods

Next generation sequencing offers several ways to study microbial communities. For agri-food sciences, identifying species in diverse food ecosystems is key for both food sustainability and food security. The aim of this study was to compare metabarcoding pipelines and markers to determine fungal diversity in food ecosystems, from Illumina short reads. We built mock communities combining the most representative fungal species in fermented meat, cheese, wine and bread. Four barcodes (ITS1, ITS2, D1/D2 and RPB2) were tested for each mock and on real fermented products. We created a database, including all mock species sequences for each barcode to compensate for the lack of curated data in available databases. Four bioinformatics tools (DADA2, QIIME, FROGS and a combination of DADA2 and FROGS) were compared. Our results clearly showed that the combined DADA2 and FROGS tool gave the most accurate results. Most mock community species were not identified by the RPB2 barcode due to unsuccessful barcode amplification. When comparing the three rDNA markers, ITS markers performed better than D1D2, as they are better represented in public databases and have better specificity to distinguish species. Between ITS1 and ITS2, differences in the best marker were observed according to the studied ecosystem. While ITS2 is best suited to characterize cheese, wine and fermented meat communities, ITS1 performs better for sourdough bread communities. Our results also emphasized the need for a dedicated database and enriched fungal-specific public databases with novel barcode sequences for 118 major species in food ecosystems.

metabarcoding, fermented foods, fermentation, fungi, yeast, mock communities, barcode marker comparison

Submission: posted 20 January 2023, validated 20 January 2023
Recommendation: posted 25 August 2023, validated 29 August 2023

Cite this recommendation as:
Strub, C. (2023) Towards a more accurate metabarcoding approach for studying fungal communities of fermented foods. Peer Community in Microbiology, 100007. https://doi.org/10.24072/pci.microbiol.100007

Recommendation

Improved characterization of food microbial ecosystems, especially fermented ones, is key to the development of food sustainability. Short-read metabarcoding is one of the most popular ways to study microbial communities. However, this approach remains complex because of the bottlenecks and biases it may entail, particularly when applied to fungal communities.

Building and using four mock communities from fermented foods (bread, wine, cheese, fermented meat), Rué et al. (2023) demonstrate that the DADA2 denoising algorithm followed by the FROGS tools gives a more accurate description of fungal communities than several commonly used bioinformatic workflows, and handles all amplicon lengths. Moreover, Rué et al. (2023) provide guidance on which barcode to use (ITS1, ITS2, D1/D2 or RPB2), depending on the fermented food studied.

Practices in metabarcoding of fungi were recently reviewed by Tedersoo et al. (2022), and their synthesis comes to the same conclusion as Rué et al. (2023). As reference databases are far from complete, notably for food ecosystems, the development of dedicated public sequence databases will enable the scientific community to lift the veil on this whole area of microbial ecology.

The study conducted by Rué et al. (2023) provides a particularly detailed approach from a technical point of view, which contributes to improving general practices in the metabarcoding of fungi. The design and use of mock communities to compare the performance of the different pipelines is a strong point of this study. Another key element is the creation and use of an in-house database of fungal barcode sequences, which improved the species-level affiliations.

The study of fungal communities by metabarcoding remains a promising avenue of research in agri-food sciences. Short-read sequencing, combined with suitable pipelines and databases, should therefore remain of interest to the microbial ecology community (Pauvert et al., 2019; Furneaux et al., 2021).

References

Furneaux, B., Bahram, M., Rosling, A., Yorou, N. S., & Ryberg, M. (2021). Long- and short-read metabarcoding technologies reveal similar spatiotemporal structures in fungal communities. Molecular Ecology Resources, 21(6), 1833-1849. https://doi.org/10.1111/1755-0998.13387

Pauvert, C., Buée, M., Laval, V., Edel-Hermann, V., Fauchery, L., Gautier, A., ... & Vacher, C. (2019). Bioinformatics matters: The accuracy of plant and soil fungal community data is highly dependent on the metabarcoding pipeline. Fungal Ecology, 41, 23-33. https://doi.org/10.1016/j.funeco.2019.03.005

Rué, O., Coton, M., Dugat-Bony, E., Howell, K., Irlinger, F., Legras, J. L., ... & Sicard, D. (2023). Comparison of metabarcoding taxonomic markers to describe fungal communities in fermented foods. bioRxiv, 2023.01.13.523754, ver. 3 peer-reviewed and recommended by Peer Community in Microbiology. https://doi.org/10.1101/2023.01.13.523754

Tedersoo, L., Bahram, M., Zinger, L., Nilsson, R. H., Kennedy, P. G., Yang, T., ... & Mikryukov, V. (2022). Best practices in metabarcoding of fungi: From experimental design to results. Molecular Ecology, 31(10), 2769-2795. https://doi.org/10.1111/mec.16460

Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article. The authors declared that they comply with the PCI rule of having no financial conflicts of interest in relation to the content of the article.

Funding:
This work was supported by the French “Microbial Ecosystems & Meta-omics” (MEM) metaprogram from INRAE. Migale is part of the Institut Français de Bioinformatique (ANR-11-INBS-0013).

Reviews

Evaluation round #2

DOI or URL of the preprint: https://doi.org/10.1101/2023.01.13.523754

Version of the preprint: 2

Author's Reply, 08 Aug 2023

Dear recommender,

We have answered all reviewer comments and hope this version will meet PCI requirements.

My best regards

Delphine Sicard

https://doi.org/10.24072/pci.microbiol.100007.ar2

Decision by Caroline Strub, posted 02 Aug 2023, validated 02 Aug 2023

Dear authors,

Could you please address the minor comments, in particular the one about the methods (line 330: is it an OTU represented by a centroid, a Swarm seed or a denoised sequence variant?). The manuscript will then be ready to be recommended.

Sincerely,

Caroline Strub

https://doi.org/10.24072/pci.microbiol.100007.d2

Reviewed by Johannes Schweichhart, 01 Aug 2023

I only have a few minor comments and one issue which has been addressed before and is not resolved yet. Otherwise I would recommend this preprint for publication.

Ad Introduction:

Line 81: This sentence does not really reflect the findings of the Ihrmark et al. paper and contradicts the findings in the preprint, which shows a high divergence rate for all pipelines.

L114: I guess "downside" not "downfall" is meant here.

L124: Building correct biological sequences is beside the point of traditional de novo clustering.

Ad Methods:

L330: I thank the authors for their answer, but apparently no changes were made in this respect in the preprint's methods. To be more explicit: something that can make a big difference, especially when talking about perfect matches to reference sequences, is what is compared with the reference sequence. Is it an OTU represented by a centroid, a Swarm seed or a denoised sequence variant? This is not implicit for every pipeline, and "following authors guidelines" is too unspecific for USEARCH and QIIME. I guess that for USEARCH the authors refer to the "recommended procedures" at https://drive5.com/usearch/manual/. There, both OTU clustering and denoising are given, which makes this reference ambiguous about how things were done in the preprint. The same is true for QIIME. It should not be necessary for the reader to screen the code in the supplementary material just to find out whether the respective pipeline was using ZOTUs, ASVs or OTUs.

Ad Discussion:

L680: It is rather likely that all primers have mismatches with certain groups of fungi.

https://doi.org/10.24072/pci.microbiol.100007.rev21

Reviewed by anonymous reviewer 2, 02 Aug 2023

The amended version and the author's responses are satisfactory.

https://doi.org/10.24072/pci.microbiol.100007.rev22

Evaluation round #1

DOI or URL of the preprint: https://doi.org/10.1101/2023.01.13.523754

Version of the preprint: 1

Author's Reply, 05 Jul 2023

https://doi.org/10.24072/pci.microbiol.100007.ar1

Decision by Caroline Strub, posted 19 Jun 2023, validated 19 Jun 2023

Dear Authors,

I would like to apologize for the delay in the handling process of your preprint.
You present a novel approach to solve the issue concerning the length polymorphism of ITS1 and ITS2 sequences in metabarcoding of fungi.

Both reviewers and I agree this is a relevant study which requires moderate revision, following comments by the reviewers.

Sincerely,

Caroline Strub

https://doi.org/10.24072/pci.microbiol.100007.d1

Reviewed by anonymous reviewer 1, 04 Apr 2023

This study, entitled “Comparison of metabarcoding taxonomic markers to describe fungal communities in fermented foods”, can be divided into two sub-sections: first, a comparison on mock communities of four common fungal identification markers (ITS1, ITS2, D1/D2 and RPB2) and seven bioinformatics workflows (using four different bioinformatics tools), including the most common approaches (OTUs, ASVs, ZOTUs), with a focus on fermented foods using four fermented food models (bread, wine, cheese, fermented meat).

The title reflects the content of the paper and the main results of the study are summarised in the abstract. The research question is very relevant to the field of food microbiology and microbial ecology and is well addressed, using relevant approaches and tools.

This paper provides an excellent contribution to the field of food microbiology. The authors demonstrate a thorough understanding of the subject matter and present a well-designed study that compares different metabarcoding pipelines and markers to determine fungal diversity in food ecosystems, with a focus on fermented foods.

The use of mock communities to validate the bioinformatics tools is a particularly strong aspect of this study, as it allows for rigorous testing of the pipelines in a controlled setting.

The comparison of four bioinformatics pipelines, including DADA2, QIIME, FROGS, and a combination of DADA2 and FROGS, is also noteworthy. The authors' demonstration of the superiority of the combined DADA2 and FROGS tools will be of interest to researchers in this field.

The paper highlights the importance of selecting appropriate markers, with the authors finding that ITS markers performed better than D1D2. The study provides guidance on the best markers for different food ecosystems, with ITS2 being best suited to characterize cheese, wine, and fermented meat communities, while ITS1 performs better for sourdough bread communities.

Overall, this scientific paper presents a thorough and well-executed study that makes a valuable contribution to the field of bioinformatics. The questions addressed are relevant to the field and the results will be of interest to researchers in agri-food sciences and microbial ecology, and the paper provides a framework for future research in this area.

I therefore recommend this paper for submission.

I also have some specific comments for the author to address:

- The relevance of Figure 1 is questionable. It does not bring a substantial amount of information. Moreover, the data in the text below Figure 1 do not seem to confirm the data represented in Figure 1: for example, for meat, the text states that 4 species per genus were found for Yarrowia and Cladosporium and 2 species for Candida only. However, in Figure 1, only a single dot can be found at 4 species/genus and 3 dots at 2 species/genus. Either there is a consistency problem between the figure and the text (especially for meat) or the data are confusingly expressed.

- Species names should be italicized (lines 249, 250, 455, 513).
- The ITS region is subject to significant size polymorphisms (insertions/deletions), as shown in Fig. 2 and Fig. 6, so it is often difficult to interpret phylogenetic trees built on sequence alignments. It might be interesting for the authors to discuss the relevance of the trees shown in Fig. 3.
Figure 5:
- Panels should be numbered or labeled (A, B, C, D).
- The legend should clarify the difference between the small dots and the bigger dots.
Figure 6:
- The figure legend states “ITS1 and ITS2 amplicon size…”. However, it seems that only ITS1 data are presented.
Figure 8:
- The chosen colors make it difficult to distinguish partially reconstructed from perfectly reconstructed sequences. A better choice of colors would greatly improve the readability of the figure.

https://doi.org/10.24072/pci.microbiol.100007.rev11

Reviewed by Johannes Schweichhart, 18 Jun 2023

Provide a detailed, objective report on the merits of the preprint.

Rué et al. benchmark combinations of four barcoding marker regions (ITS1, ITS2, RPB2 and D1/D2 LSU) and bioinformatic pipelines (USEARCH, QIIME2, DADA2 and FROGS) in terms of their ability to recover fungal species and type sequences from four mock communities for fermented meat, wine, cheese and sourdough (118 fungal species in total). Based on merged data from all four marker regions, the authors conclude that a combination of FROGS procedures and the DADA2 denoising algorithm performs better than the other pipelines tested. They then apply this procedure to fermented meat, wine, cheese and sourdough samples (n=24) and provide a description of their findings.
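A minimal sketch of how species recovery from a mock community could be scored, assuming each pipeline's output has already been reduced to a set of species names. The function name `score_mock` and the toy species lists are hypothetical illustrations; they are not the metrics, code or mock compositions used in the preprint.

```python
def score_mock(expected, observed):
    """Compare the species expected in a mock community with those a pipeline reports."""
    expected, observed = set(expected), set(observed)
    recovered = expected & observed   # expected species that were found
    missed = expected - observed      # expected species that were not found
    spurious = observed - expected    # reported species that are not in the mock
    recall = len(recovered) / len(expected) if expected else 0.0
    precision = len(recovered) / len(observed) if observed else 0.0
    return {
        "recovered": recovered,
        "missed": missed,
        "spurious": spurious,
        "recall": recall,
        "precision": precision,
    }


# Toy input, invented for illustration only (not a mock community from the study).
expected = [
    "Saccharomyces cerevisiae",
    "Kazachstania humilis",
    "Wickerhamomyces anomalus",
    "Torulaspora delbrueckii",
]
observed = [
    "Saccharomyces cerevisiae",
    "Kazachstania humilis",
    "Pichia kudriavzevii",
]

result = score_mock(expected, observed)
print(sorted(result["missed"]), result["recall"], result["precision"])
```

Scoring each marker and pipeline combination separately in this way would keep the per-marker results distinguishable.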

Identify flaws (if any) in the design of the research, and in the analysis and interpretation of results.

Ad Fig. 3: To my knowledge, phylogenetic relationships derived from ITS1 and ITS2 sequences become unreliable above class level. If the authors want to include this figure, I would suggest using LSU sequences instead.
Ad "Analysis of real samples" (lines 449 ff.): Results are only discussed in the context of marker choice. This might be misleading. In the case of missing Yarrowia and Candida species when using the ITS1 marker, this can more likely be attributed to the primers used (ITS1F and ITS2), which have known mismatches to those taxa (see Tedersoo and Lindahl, 2016).

Expose your concerns (if any) about ethics or scientific misconduct.

No concerns.

State the preprint’s strengths as well as its weaknesses. Try to consider both the technical merit and the scientific significance.

An unsolved issue in short-read metabarcoding of fungi is the length polymorphism of ITS1 and ITS2 sequences which commonly leads to the exclusion of taxa with longer variants of those markers. The authors of FROGS introduce a novel approach to solve this issue and, based on simulated data, show that this leads to higher recovery rates of fungal taxa without compromising on precision in the taxonomic classification of the fungal community. In this preprint, this approach is extended to four barcoding marker sites (of which one is out of range for the short-read sequencing platforms) and four fungal mock communities of fermented foods. While not completely novel to this preprint it is a promising methodological approach which has potential to improve the general practice in the metabarcoding of fungi and should be promoted.
The preprint is very detailed on the technical side of things but rather brief concerning the biological background and the presentation of the results for real food samples. Thus, the motivation for this study is not very clear. The comparison of the results of fermented food samples is purely descriptive and limited to a few selected findings, which are only attributed to the targeted marker sites. As mentioned briefly by the authors, primer bias is another important factor and should be included. A graphical representation of these results is missing.
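The length-polymorphism problem described above comes down to simple arithmetic on paired-end reads: two reads of length R that must overlap by at least k bases can only be merged into an amplicon of at most 2R - k bases. A minimal sketch, assuming 2 x 250 bp reads and a 20-base minimum overlap and ignoring primer and quality trimming. Both values are illustrative assumptions, not the settings of the preprint, and the function names are hypothetical.

```python
def max_mergeable_length(read_length, min_overlap):
    """Longest amplicon that two paired-end reads can still span with the required overlap."""
    return 2 * read_length - min_overlap


def can_merge(amplicon_length, read_length=250, min_overlap=20):
    """True if the forward and reverse reads overlap enough to be merged."""
    return amplicon_length <= max_mergeable_length(read_length, min_overlap)


# Amplicon lengths chosen for illustration only.
for length in (300, 480, 600):
    status = "mergeable" if can_merge(length) else "too long to merge"
    print(length, status)
```

Under these assumptions, any ITS variant longer than 480 bp yields read pairs that cannot be merged, which is why pipelines that keep only merged reads tend to lose taxa with long ITS1 or ITS2 regions.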

If there is something critically missing, report it.

Lines 125 ff.: A key component of this study is the set of fungal mock communities representative of fermented meat, wine, cheese and sourdough. The authors write that the selection of those species was "based on an inventory of the most frequently described species in the literature". Yet neither references for the literature sources used for that purpose nor further methodological insight into how these species were selected are given.
Lines 294 ff.: Which algorithms were used here? UPARSE or UNOISE (i.e. OTU or ZOTU delineation)? If UPARSE, the comparison of perfectly reconstructed sequences (Fig. 5) would not be sound, since OTUs would be compared to ASVs. Similarly, was QIIME2 used with the DADA2 plugin or with an OTU clustering algorithm?
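The concern about comparing OTUs with ASVs can be made concrete with a small sketch. It assumes a 97% identity threshold for OTU clustering (a common choice, not confirmed for the preprint) and uses synthetic, equal-length sequences; `identity` is a hypothetical helper, not a function from USEARCH, QIIME or DADA2. A centroid can sit within the clustering threshold of a reference sequence without being a perfect match to it.

```python
def identity(seq_a, seq_b):
    """Fraction of identical positions between two equal-length, ungapped sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this sketch only handles equal-length sequences")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


OTU_THRESHOLD = 0.97  # assumed clustering threshold, for illustration

# Synthetic 100-base reference and a centroid differing by one substitution.
reference = "ACGT" * 25
centroid = reference[:50] + "T" + reference[51:]  # position 50 is 'G' in the reference

within_otu = identity(reference, centroid) >= OTU_THRESHOLD
perfect_match = centroid == reference

print(identity(reference, centroid), within_otu, perfect_match)
```

Here the centroid would be assigned to the same OTU as the reference yet would not count as a perfectly reconstructed sequence, so the type of representative sequence each pipeline produces matters for that metric.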

Provide specific suggestions for improvements.

Ad Fig. 5: In my opinion, the outcomes of the pipeline benchmark would be clearer and more transparent if the results from all four marker sites were presented separately and not mixed together.
Consider re-running some of the analyses using the current version of UNITE (v9.0). The version used here is >2 years old and since then ~9x more fungal sequences and >25% more fungal reference sequences have been added to the database.
Colours with better contrast could be picked in Fig. 8 to discriminate partially and perfectly reconstructed sequences.

https://doi.org/10.24072/pci.microbiol.100007.rev12
