

NEXT.BIO 2025 - Sustainability & ClimaTech through the power of AI-driven enzyme engineering
In October 2025, A16Z #SFTechWeek will once again take over San Francisco. We are organizing NEXT.BIO, a world-class, week-long tech conference featuring the best startups, industry leaders and investors in AI x BIO across a series of Tech Events, a unique Hackathon and a DEMO Day 🚀
This event is part of our Tech Event series within the full-week conference.
🧩 Suggested Themes & Challenges:
What biomining datasets are available?
1. Rare Earth Element Biomining with Engineered E. coli
What it is: Adsorption data for terbium (Tb) onto engineered E. coli with lanthanide-binding tags, tested across temperatures up to 100 °C in synthetic Great Salt Lake brine. Data is in Excel format with ICP-MS measurements. (Sources: AmeriGEOSS Data, Data.gov)
Why it matters: Useful for modeling temperature-dependent metal adsorption and microbial optimization in biomining contexts.
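For teams that want to model adsorption from data like the terbium set above, one common starting point is a Langmuir isotherm fit. The sketch below is only an illustration under stated assumptions: it assumes the measurements can be reduced to pairs of equilibrium concentration and adsorbed amount, and that a Langmuir model is appropriate at all. The function names `linear_fit` and `fit_langmuir` are hypothetical, and the numbers are synthetic values generated from known parameters, not values taken from the dataset.

```python
from statistics import fmean


def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def fit_langmuir(ce, qe):
    """Fit q = q_max * K * C / (1 + K * C) using the linearized form
    C / q = C / q_max + 1 / (q_max * K)."""
    slope, intercept = linear_fit(ce, [c / q for c, q in zip(ce, qe)])
    q_max = 1.0 / slope      # slope of the linearized form is 1 / q_max
    k = slope / intercept    # intercept is 1 / (q_max * K)
    return q_max, k


# Synthetic, noise-free points from known parameters (NOT real Tb measurements).
true_q_max, true_k = 50.0, 0.2
ce = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
qe = [true_q_max * true_k * c / (1 + true_k * c) for c in ce]

q_max, k = fit_langmuir(ce, qe)
print(f"q_max = {q_max:.3f}, K = {k:.3f}")
```

Repeating such a fit separately at each tested temperature would give one K per temperature, which could then be compared across the range.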
2. Omics Data on Acidophile Biofilms in Metal Extraction
What it is: Systems biology datasets (proteomics, RNA-seq transcriptomics, and fluorescence microscopy) focused on acidophilic biofilms (e.g., Leptospirillum ferriphilum, Sulfobacillus thermosulfidooxidans, Acidithiobacillus caldus) involved in metal extraction. Accessible via FAIRDOMHub, with raw RNA-seq available in ENA and proteomics in PRIDE. (Sources: Nature, PMC)
Why it matters: Offers a high-resolution view of gene and protein expression in bioleaching scenarios—valuable for dissecting mechanisms underlying microbial metal extraction.
3. Metabolomics of Chilean Biomining Bacteria
What it is: Metabolite profiles (using CE-MS) of A. ferrooxidans strain Wenelen and A. thiooxidans strain Licanantay grown with different substrates (iron, sulfur, chalcopyrite), in both sessile and planktonic states. The profiles reveal key metabolites such as glutathione and spermidine. (Sources: PMC, PubMed)
Why it matters: Illuminates metabolic pathways crucial for biofilm formation and metal oxidation—especially sulfur and energy metabolism—within biomining microorganisms.
Broader Repositories and Resources
4. “Awesome-mining-data” (GitHub Repository)
What it is: A curated list of open datasets related to the mining industry—not limited to biomining—including geological, geophysical, chemical, and assay data such as:
Thalanga (2024): Drilling, geology, geophysics
GeoMet (2022): Geochemical analyses
Macpass (2023): Drillholes and topography
Other synthetic or real datasets, e.g., Porphyry and MineLib (Source: GitHub)
Why it matters: While not microbial-centric, these data sources can support cross-disciplinary modeling, simulation, and integration of biomining processes within geological contexts.
5. National Microbiology Data Center (NMDC)
What it is: A large-scale Chinese microbiological data hub integrating over 800 databases and 6 petabytes of data, including microbial omics, strain catalogs, and more. Databases include the Global Catalogue of Microorganisms, type strain genomes, and standardized microbiome analyses (gcMeta). (Source: Wikipedia)
Why it matters: Excellent source for reference genomes, strain-level data, and comparative analysis—especially for extremophiles used in biomining.
Summary Table
Dataset Resource | Type | Key Focus
E. coli engineered adsorption data | Experimental adsorption data | Rare-earth metal binding & temperature
Acidophile biofilm omics (FAIRDOMHub) | RNA-seq, proteomics, imaging | Biofilm regulation and metal extraction
Metabolomics of A. ferrooxidans & A. thiooxidans | Metabolic profiling | Sulfur/iron metabolism, stress response
Awesome-mining-data (GitHub) | Geoscience and mining metadata | Mining context for biomining integration
NMDC (China) | Microbial omics & strain data | Reference data and comparative analysis
Suggestions for Next Steps
Define Your Objective: Are you modeling adsorption kinetics, exploring microbial gene expression, or integrating geochemical context? The best dataset depends on your research angle.
Integrate Multiple Layers: Consider combining transcriptomics and metabolomics datasets (e.g., gene expression alongside metabolite profiles measured under the same conditions) to build mechanistic models of biomining activity.
Expand Scope with Geodata: If your work involves simulating ore environments or designing experiments, geoscience datasets (from Awesome-mining-data) provide helpful context.
Explore NMDC for Strain-Level Comparisons: Use genomic or microbial catalog data from NMDC to compare strains or support discovery of biomining-relevant patterns.
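As a small illustration of the "Integrate Multiple Layers" suggestion, the sketch below aligns two measurement layers on the growth conditions they share and computes a Pearson correlation between them. Everything in it is a placeholder: the dictionaries, the `extra_condition` key, and the numeric values are hypothetical and do not come from any of the datasets listed above, and a real analysis would need many more samples plus proper normalization.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical placeholder values keyed by growth condition.
expression = {"iron": 1.0, "sulfur": 2.0, "chalcopyrite": 3.0}
metabolite = {"iron": 10.0, "sulfur": 20.0, "chalcopyrite": 30.0,
              "extra_condition": 5.0}

# Keep only the conditions measured in both layers, in a stable order.
shared = sorted(expression.keys() & metabolite.keys())
r = pearson([expression[c] for c in shared],
            [metabolite[c] for c in shared])
print(shared, round(r, 6))
```

Joining on an explicit shared key, rather than on row order, avoids silently pairing measurements from different conditions when the two layers were collected or exported separately.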
From the Community: Share and Discover Datasets
Here's a relevant thought from practitioners browsing biomining and related domains:
“On NCBI GEO/SRA you will find almost all type of biological dataset. Just try to reproduce one already published.” (Source: Reddit)
If you haven't yet explored NCBI GEO and SRA, those repositories may also hold biomining-relevant transcriptomics or metagenomics data, especially from studies of mining-related microbial communities. Proteomics data is usually deposited elsewhere, for example in PRIDE.
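One way to start exploring SRA programmatically is through NCBI's Entrez E-utilities. The sketch below only builds an `esearch` request URL and does not send it. The helper name `build_esearch_url` and the example search term are assumptions chosen for illustration, and the same pattern should apply to GEO DataSets by switching the `db` value (believed to be `gds`, which is worth confirming against the NCBI documentation).

```python
from urllib.parse import urlencode

ESEARCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db, term, retmax=20):
    """Return an E-utilities esearch URL for the given database and query."""
    params = {
        "db": db,            # Entrez database name, e.g. "sra"
        "term": term,        # free-text or fielded search query
        "retmode": "json",   # ask for a JSON response
        "retmax": retmax,    # maximum number of record IDs to return
    }
    return f"{ESEARCH_BASE}?{urlencode(params)}"


# Example query (illustrative only): an organism named earlier in this page.
url = build_esearch_url("sra", "Acidithiobacillus ferrooxidans")
print(url)
```

The printed URL can be opened in a browser or fetched with any HTTP client; NCBI asks users to respect its request-rate guidelines when doing so.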