Salmonella infections remain a significant public health challenge in the U.S., affecting approximately 1.35 million people annually, leading to hospitalizations and even fatalities. Until now, pinpointing the exact sources of these infections has been difficult due to the complexity of transmission pathways. However, researchers from the Centers for Disease Control and Prevention (CDC), the Food and Drug Administration (FDA), and the U.S. Department of Agriculture (USDA) have leveraged whole-genome sequencing (WGS) and machine learning to trace the origins of these infections.
The study, published in Emerging Infectious Diseases, analyzed a massive dataset of Salmonella isolates from food and animal sources, revealing that chicken and vegetables together account for roughly two-thirds of attributed human cases (34% and 30%, respectively). This finding has critical implications for food safety policies and prevention strategies.
The Role of Whole-Genome Sequencing in Tracking Salmonella
Traditional laboratory methods have often fallen short in tracing foodborne Salmonella infections: only about 5% of cases are attributed to a known source. With the advent of WGS, scientists can now compare the complete genetic makeup of bacterial isolates, improving accuracy in tracing contamination.
The researchers compiled 18,661 Salmonella isolates from food and animal samples held in National Center for Biotechnology Information (NCBI) databases. These samples were collected by U.S. government agencies, including the FDA, USDA’s Food Safety and Inspection Service (FSIS), and the CDC. Using this dataset, scientists categorized the isolates into 15 distinct food groups and applied machine learning techniques to identify the most likely sources of infection.
Additionally, 6,470 human Salmonella isolates were obtained from the Foodborne Diseases Active Surveillance Network (FoodNet), covering roughly 15% of the U.S. population between 2014 and 2017. This provided a strong basis for comparison between human infections and potential food sources.
Key Findings: Chicken and Vegetables as Primary Culprits
The research team’s machine learning model analyzed 6,470 human Salmonella cases and found that:
- 34% of infections were linked to chicken
- 30% were traced to vegetables
- 12% came from turkey
- 11% were associated with pork
Taken together, chicken and vegetables account for 64% of the attributed infections by these figures (34% + 30%). The study also identified the most common Salmonella serotypes associated with these food sources:
- Chicken was primarily linked to Enteritidis, Typhimurium, Heidelberg, and Infantis
- Vegetables were mainly associated with Javiana and Newport
- Pork was the dominant source for Salmonella enterica 4,[5],12:i:− (STM)
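The attribution shares and serotype associations reported in this section can be tallied with a few lines of Python. This is only a bookkeeping sketch of the figures quoted in the article; the variable and function names are illustrative, and the serotype key for the pork-associated strain uses an ASCII hyphen for convenience.

```python
# Attribution shares as quoted in the article (percent of human cases).
shares = {"chicken": 34, "vegetables": 30, "turkey": 12, "pork": 11}

# Serotype -> food source it was mainly linked to, as listed in the article.
serotype_source = {
    "Enteritidis": "chicken",
    "Typhimurium": "chicken",
    "Heidelberg": "chicken",
    "Infantis": "chicken",
    "Javiana": "vegetables",
    "Newport": "vegetables",
    "4,[5],12:i:-": "pork",
}


def combined_share(sources):
    """Sum the quoted shares for the given food sources."""
    return sum(shares[s] for s in sources)


top_two = combined_share(["chicken", "vegetables"])
listed_total = combined_share(shares)  # iterating a dict yields its keys

print(f"chicken + vegetables: {top_two}%")
print(f"all four listed sources: {listed_total}%")
print(f"other or unlisted sources: {100 - listed_total}%")
print(f"Newport is mainly linked to: {serotype_source['Newport']}")
```

The four listed sources sum to 87%, so the remaining 13% falls to food groups the article does not break out.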
Why Are Chicken and Vegetables High-Risk Foods?
1. Chicken and Poultry Processing Challenges
Raw chicken is particularly vulnerable to Salmonella contamination because the bacteria naturally reside in poultry intestines. Poor hygiene in processing plants, cross-contamination in kitchens, and undercooking all contribute to the high infection rate.
2. Vegetables and Agricultural Practices
Unlike meat, vegetables become contaminated through exposure to contaminated soil, water, or animal manure. Salmonella can survive in moist environments and even inside plant tissues, making it difficult to eliminate through washing alone. Contaminated irrigation water and improper handling during processing further exacerbate the risk.
The Machine Learning Approach and Its Accuracy
The study used a random forest machine learning algorithm, an ensemble of decision trees in which each tree votes on the most likely source of an isolate based on its genetic markers. The model achieved the following accuracy rates for different food sources:
- 97% accuracy for chicken
- 82% for vegetables
- 88% for turkey
- 83% for pork
- 77% for beef
These high accuracy rates indicate that machine learning, combined with WGS, can provide precise insights into the sources of Salmonella outbreaks.
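To make the idea of tree-based voting and per-source accuracy concrete, here is a deliberately simplified sketch. It is not the study's model: instead of full decision trees it uses one-level trees (stumps), each fitted on a bootstrap sample and a single randomly chosen locus, combined by majority vote. The allele profiles, labels, and function names are all made up for illustration.

```python
import random
from collections import Counter


def majority(labels):
    """Return the most common label (ties go to the first one seen)."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(rows, labels, rng):
    """Fit a one-level tree on a bootstrap sample and one random locus."""
    n = len(rows)
    idx = [rng.randrange(n) for _ in range(n)]   # bootstrap: sample with replacement
    locus = rng.randrange(len(rows[0]))          # random feature for this stump
    by_allele = {}
    for i in idx:
        by_allele.setdefault(rows[i][locus], []).append(labels[i])
    table = {allele: majority(ls) for allele, ls in by_allele.items()}
    fallback = majority([labels[i] for i in idx])  # used for unseen alleles
    return locus, table, fallback


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train an ensemble of stumps with a fixed seed for repeatability."""
    rng = random.Random(seed)
    return [train_stump(rows, labels, rng) for _ in range(n_trees)]


def predict(forest, row):
    """Each stump votes; the ensemble returns the majority label."""
    votes = [table.get(row[locus], fallback) for locus, table, fallback in forest]
    return majority(votes)


def per_source_accuracy(forest, rows, labels):
    """Fraction of isolates from each true source that were predicted correctly."""
    correct, total = Counter(), Counter()
    for row, label in zip(rows, labels):
        total[label] += 1
        correct[label] += predict(forest, row) == label
    return {src: correct[src] / total[src] for src in total}


# Toy allele profiles at three hypothetical loci (not real data).
train_rows = [(1, 1, 1)] * 4 + [(2, 2, 2)] * 4 + [(3, 3, 3)] * 4
train_labels = ["chicken"] * 4 + ["vegetables"] * 4 + ["pork"] * 4

forest = train_forest(train_rows, train_labels)

# Held-out toy isolates with known sources.
test_rows = [(1, 1, 1), (2, 2, 2), (3, 3, 3)]
test_labels = ["chicken", "vegetables", "pork"]

print(per_source_accuracy(forest, test_rows, test_labels))
print(predict(forest, (1, 1, 2)))  # a mixed profile; the votes are split
```

Per-source accuracy here is the share of isolates from a given true source that the ensemble labels correctly, which is one plausible reading of the per-food figures quoted above.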
Implications for Food Safety and Public Health
Given the findings, public health officials and regulatory agencies need to strengthen food safety measures for poultry and fresh produce. Some recommended actions include:
- Stricter Poultry Regulations – The USDA and FDA should impose tighter controls on poultry processing plants, ensuring better hygiene and reducing cross-contamination risks.
- Improved Irrigation and Farming Practices – Farmers should be encouraged to use cleaner irrigation water and avoid animal manure that might be contaminated with Salmonella.
- Enhanced Consumer Awareness – Educating the public on safe food handling, including thorough cooking of poultry and proper washing of vegetables, can help reduce infection risks.
- Wider Use of Whole-Genome Sequencing – Expanding WGS surveillance can improve outbreak tracking and response, ensuring quicker interventions.
- Increased Monitoring of Non-Food Sources – Many infections remain unattributed. Studying environmental and wildlife sources can help refine outbreak tracking methods.
Challenges and Future Directions
Despite the success of this study, some challenges remain. About 44% of Salmonella cases could not be definitively classified due to limited data on certain food sources. Future research should:
- Expand the database with more non-chicken isolates to balance the dataset.
- Collect samples from environmental and wildlife sources to understand alternative transmission routes.
- Improve data collection from different geographic regions to enhance accuracy.
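Collecting more non-chicken isolates addresses imbalance at the source. A common complementary technique, which the article does not say the study used, is to weight under-represented classes more heavily during training. The sketch below computes inverse-frequency weights, n_samples / (n_classes * class_count), the same heuristic scikit-learn applies for `class_weight="balanced"`. The isolate counts are invented purely for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * k) for cls, k in counts.items()}


# Made-up counts for illustration only; not the study's data.
toy_labels = ["chicken"] * 6 + ["vegetables"] * 3 + ["pork"] * 1

weights = balanced_class_weights(toy_labels)
for source, weight in weights.items():
    print(f"{source}: {weight:.2f}")
```

Rarer sources receive larger weights, so misclassifying one of the few pork isolates costs the model more than misclassifying one of the many chicken isolates.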
This landmark study has reshaped our understanding of Salmonella transmission, confirming that chicken and vegetables are the leading sources of infection in the U.S. The combination of whole-genome sequencing and machine learning has provided unprecedented accuracy in identifying these sources, paving the way for improved food safety policies.
As we move forward, enhanced surveillance, better consumer education, and stricter regulatory measures will be crucial in reducing Salmonella-related illnesses and ensuring a safer food supply for everyone.