ncbi_biosample read

scidex.forge.ncbi_biosample

Look up per-sample biological and clinical metadata in NCBI BioSample. Returns accession (SAMN/SAME/SAMD), title, organism, taxonomy_id, attributes dict (tissue, disease, sex, age, developmental_stage, treatment, …), SRA run count, and submission date. Supports three query modes via query_type: 'accession' for direct SAMN/SAME/SAMD lookup, 'keyword' for full-text sample search, and 'bioproject' for all samples linked to a PRJNA/PRJEB/PRJDB project. Complements scidex.forge.bioproject (project-level) and scidex.forge.sra_run (run-level) with specimen-level context. API: NCBI Entrez E-utilities (esearch + efetch, db=biosample), which is free and requires no authentication. Anonymous access is limited to 3 requests per second; set NCBI_API_KEY to raise the limit to 10 requests per second.
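
The three query modes map onto the accession prefixes listed above, so a caller can often pick query_type from the query string alone. The sketch below is one way to do that, not part of the tool: `guess_query_type` and `build_payload` are hypothetical helper names, and the regular expressions assume accessions consist of the stated prefix followed by digits (with an optional extra letter, as in SAMEA-style accessions).

```python
import re

# Prefixes come from the tool description; the exact patterns are an assumption.
_BIOSAMPLE = re.compile(r"^SAM[NED][A-Z]?\d+$")
_BIOPROJECT = re.compile(r"^PRJ(NA|EB|DB)\d+$")


def guess_query_type(query: str) -> str:
    """Return 'accession', 'bioproject', or 'keyword' for a query string."""
    q = query.strip().upper()
    if _BIOSAMPLE.match(q):
        return "accession"
    if _BIOPROJECT.match(q):
        return "bioproject"
    # Anything that is not a recognizable accession is treated as free text.
    return "keyword"


def build_payload(query: str, limit: int = 10) -> dict:
    """Build a request body with the three fields of the input schema."""
    return {
        "query": query.strip(),
        "query_type": guess_query_type(query),
        "limit": limit,
    }
```

Passing query_type explicitly remains the safer choice when a keyword search might legitimately look like an accession.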

HTTP: POST /api/scidex/forge/ncbi_biosample

Invoke

Invoking this tool calls scidex.tool.invoke on the substrate with this tool name. The request JSON must match the input schema. The substrate runs the tool, records the call in substrate_tool_calls, and returns a structured envelope.

Schemas

Input schema
{
  "additionalProperties": false,
  "description": "Input schema for ``scidex.forge.ncbi_biosample``.",
  "properties": {
    "query": {
      "description": "BioSample accession (SAMN*/SAME*/SAMD*), keyword search, or BioProject accession (PRJNA*/PRJEB*/PRJDB*) depending on query_type.",
      "maxLength": 512,
      "minLength": 1,
      "title": "Query",
      "type": "string"
    },
    "query_type": {
      "default": "accession",
      "description": "How to interpret query: ``'accession'`` — direct SAMN/SAME/SAMD lookup; ``'keyword'`` — full-text BioSample search; ``'bioproject'`` — fetch all samples linked to a PRJNA/PRJEB/PRJDB project.",
      "title": "Query Type",
      "type": "string"
    },
    "limit": {
      "default": 10,
      "description": "Maximum number of BioSample records to return (1–100, default 10).",
      "maximum": 100,
      "minimum": 1,
      "title": "Limit",
      "type": "integer"
    }
  },
  "required": [
    "query"
  ],
  "title": "ForgeNCBIBioSampleIn",
  "type": "object"
}
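
A request can be checked against these constraints before it is sent. The following is a minimal sketch using only the standard library; `validate_input` is a hypothetical helper, not part of the tool. Note one assumption: the schema types query_type as a plain string, so restricting it to the three values named in its description is stricter than the schema itself.

```python
# Allowed values are taken from the query_type description, not from an enum
# in the schema (assumption: the tool rejects other values).
ALLOWED_QUERY_TYPES = {"accession", "keyword", "bioproject"}
KNOWN_FIELDS = {"query", "query_type", "limit"}


def validate_input(payload: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []

    # "additionalProperties": false means unknown keys are rejected.
    unknown = set(payload) - KNOWN_FIELDS
    if unknown:
        errors.append(f"unknown fields: {sorted(unknown)}")

    # "query" is required, with minLength 1 and maxLength 512.
    query = payload.get("query")
    if not isinstance(query, str) or not (1 <= len(query) <= 512):
        errors.append("query must be a string of 1 to 512 characters")

    # "query_type" defaults to "accession" when omitted.
    query_type = payload.get("query_type", "accession")
    if not isinstance(query_type, str) or query_type not in ALLOWED_QUERY_TYPES:
        errors.append("query_type must be accession, keyword, or bioproject")

    # "limit" defaults to 10 and must be an integer from 1 to 100.
    limit = payload.get("limit", 10)
    if isinstance(limit, bool) or not isinstance(limit, int) or not (1 <= limit <= 100):
        errors.append("limit must be an integer from 1 to 100")

    return errors
```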
Output schema
{
  "$defs": {
    "NCBIBioSampleEntry": {
      "description": "One NCBI BioSample record.",
      "properties": {
        "accession": {
          "description": "BioSample accession (e.g. ``'SAMN00000001'``).",
          "title": "Accession",
          "type": "string"
        },
        "title": {
          "default": "",
          "description": "Sample title or description.",
          "title": "Title",
          "type": "string"
        },
        "organism": {
          "default": "",
          "description": "Organism scientific name (e.g. ``'Homo sapiens'``).",
          "title": "Organism",
          "type": "string"
        },
        "taxonomy_id": {
          "anyOf": [
            {
              "type": "integer"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "description": "NCBI taxonomy ID for the organism (None if not reported).",
          "title": "Taxonomy Id"
        },
        "attributes": {
          "additionalProperties": {
            "type": "string"
          },
          "description": "All sample attributes keyed by attribute_name (e.g. tissue, disease, sex, age, developmental_stage, treatment). Values are strings.",
          "title": "Attributes",
          "type": "object"
        },
        "sra_run_count": {
          "anyOf": [
            {
              "type": "integer"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "description": "Number of associated SRA runs (None if not reported in record).",
          "title": "Sra Run Count"
        },
        "submission_date": {
          "default": "",
          "description": "Submission date (YYYY-MM-DD or ISO-8601 partial).",
          "title": "Submission Date",
          "type": "string"
        }
      },
      "required": [
        "accession"
      ],
      "title": "NCBIBioSampleEntry",
      "type": "object"
    }
  },
  "description": "Response shape for ``scidex.forge.ncbi_biosample``.",
  "properties": {
    "results": {
      "description": "BioSample records matching the query.",
      "items": {
        "$ref": "#/$defs/NCBIBioSampleEntry"
      },
      "title": "Results",
      "type": "array"
    },
    "not_found": {
      "default": false,
      "description": "True when the query returned no matching samples.",
      "title": "Not Found",
      "type": "boolean"
    },
    "total_found": {
      "default": 0,
      "description": "Total BioSample UIDs returned by esearch before the limit.",
      "title": "Total Found",
      "type": "integer"
    },
    "query": {
      "default": "",
      "description": "The original query string.",
      "title": "Query",
      "type": "string"
    },
    "took_ms": {
      "description": "Wall-clock time for all upstream calls in ms.",
      "title": "Took Ms",
      "type": "integer"
    }
  },
  "required": [
    "took_ms"
  ],
  "title": "ForgeNCBIBioSampleOut",
  "type": "object"
}
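
On the client side, the output schema can be mirrored with a small dataclass. This is a sketch under two assumptions: the dict passed in is already the tool output described above (any outer envelope has been unwrapped), and a missing attributes field is treated as an empty dict, since the schema gives it no default. `BioSampleEntry` and `parse_response` are hypothetical names.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List, Optional


@dataclass
class BioSampleEntry:
    """Client-side mirror of NCBIBioSampleEntry; only accession is required."""
    accession: str
    title: str = ""
    organism: str = ""
    taxonomy_id: Optional[int] = None
    attributes: Dict[str, str] = field(default_factory=dict)
    sra_run_count: Optional[int] = None
    submission_date: str = ""


_ENTRY_FIELDS = {f.name for f in fields(BioSampleEntry)}


def parse_response(body: dict) -> List[BioSampleEntry]:
    """Convert a tool output dict into entries; empty list when not_found."""
    if body.get("not_found", False):
        return []
    entries = []
    for item in body.get("results", []):
        # Ignore keys this sketch does not know about instead of failing.
        known = {k: v for k, v in item.items() if k in _ENTRY_FIELDS}
        entries.append(BioSampleEntry(**known))
    return entries


# Invented input for illustration only; not real API output.
example = {
    "results": [{"accession": "SAMN00000001", "attributes": {"tissue": "liver"}}],
    "total_found": 1,
    "took_ms": 12,
}
first = parse_response(example)[0]
```

Because total_found counts UIDs before the limit is applied, comparing it with the number of parsed entries shows whether results were truncated.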

curl snippet

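Set the SCIDEX_JWT environment variable to a valid bearer token (the authorization header is double-quoted so that the shell expands it), and prefix the path with your deployment's host. Read verbs are usually accessible without auth in dev; production requires a JWT. The "query" value must be non-empty (minLength 1), so fill it in before running.

curl -sS -X POST '/api/scidex/forge/ncbi_biosample' \
  -H "authorization: Bearer $SCIDEX_JWT" \
  -H 'content-type: application/json' \
  -d '{
  "query": ""
}'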
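
The same request can be prepared from Python with the standard library. The sketch below only builds the request object and leaves sending to the caller; `build_request` is a hypothetical helper, and the base URL is an assumption because the snippet above gives only the path.

```python
import json
import urllib.request
from typing import Optional

TOOL_PATH = "/api/scidex/forge/ncbi_biosample"


def build_request(base_url: str, payload: dict,
                  token: Optional[str] = None) -> urllib.request.Request:
    """Build the POST request; base_url is whatever host serves the API."""
    headers = {"content-type": "application/json"}
    if token:
        # Omit the header entirely when no token is supplied (dev setups).
        headers["authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url=base_url.rstrip("/") + TOOL_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# To send it (requires a reachable deployment):
#   with urllib.request.urlopen(request) as resp:
#       body = json.load(resp)
```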
