tcga_genomics read

scidex.forge.tcga_genomics

Queries the NCI Genomic Data Commons (GDC) for TCGA somatic mutation statistics. Returns per-gene mutation counts, affected case counts, mutation frequency (average mutations per affected case), and the top amino acid changes across 33 TCGA cancer types (>11,000 cases). The optional cancer_type filter narrows the query to a single TCGA project (e.g. TCGA-LUAD). The tool complements ClinVar (germline pathogenicity) and the GWAS Catalog (common variants) with somatic cancer mutation evidence. API: https://api.gdc.cancer.gov/ (public; no API key required for mutation queries).

HTTP: POST /api/scidex/forge/tcga_genomics

Invoke

Invoking this tool calls scidex.tool.invoke on the substrate with this tool name. The JSON request body must match the input schema. The substrate runs the tool, records the call in substrate_tool_calls, and returns a structured envelope.
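
A minimal Python sketch of the invocation, using only the standard library. It builds the POST request without sending it. The host `https://scidex.example`, the token value, and the helper name `build_invoke_request` are assumptions for illustration; this page gives only the path. It also assumes the request body is the input object itself, as in the curl snippet on this page.

```python
import json
import urllib.request

TOOL_PATH = "/api/scidex/forge/tcga_genomics"  # path taken from this page


def build_invoke_request(base_url, payload, jwt=None):
    """Build (but do not send) a POST request for the tool.

    base_url is an assumption: substitute your deployment's origin.
    """
    headers = {"content-type": "application/json"}
    if jwt:
        # Only attach the bearer token when one is supplied.
        headers["authorization"] = f"Bearer {jwt}"
    return urllib.request.Request(
        base_url.rstrip("/") + TOOL_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_invoke_request(
    "https://scidex.example",  # hypothetical host
    {"gene_symbols": ["TP53", "KRAS"], "cancer_type": "TCGA-LUAD"},
    jwt="PLACEHOLDER_TOKEN",  # placeholder, not a real token
)
# To send: pass req to urllib.request.urlopen and JSON-decode the body.
```

Separating request construction from sending keeps the payload easy to inspect before any network call is made.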

Schemas

Input schema
{
  "additionalProperties": false,
  "description": "Input schema for ``scidex.forge.tcga_genomics``.",
  "properties": {
    "gene_symbols": {
      "description": "HGNC gene symbols to query (e.g. ``['TP53', 'KRAS']``). Returns somatic mutation statistics per gene across TCGA projects.",
      "items": {
        "type": "string"
      },
      "title": "Gene Symbols",
      "type": "array"
    },
    "cancer_type": {
      "anyOf": [
        {
          "maxLength": 64,
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "description": "TCGA project code filter (e.g. ``'TCGA-LUAD'`` for lung adenocarcinoma, ``'TCGA-BRCA'`` for breast cancer). When None, aggregates across all 33 TCGA projects.",
      "title": "Cancer Type"
    },
    "data_type": {
      "default": "mutation",
      "description": "Data category to retrieve: ``'mutation'`` (somatic SSMs), ``'expression'`` (requires TCGA_API_TOKEN for controlled access), or ``'both'``. Default: ``'mutation'``.",
      "title": "Data Type",
      "type": "string"
    },
    "limit": {
      "default": 20,
      "description": "Maximum results to return (gene × cancer-type pairs). Default 20.",
      "maximum": 200,
      "minimum": 1,
      "title": "Limit",
      "type": "integer"
    }
  },
  "required": [
    "gene_symbols"
  ],
  "title": "ForgeTCGAIn",
  "type": "object"
}
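
The constraints in the input schema can be checked client-side before a call is made. The following is a hand-written sketch, not the service's own validator, and the function name `validate_input` is hypothetical. It mirrors only what the schema states: `gene_symbols` is required, unknown fields are rejected, `cancer_type` is a string of at most 64 characters or null, and `limit` is an integer from 1 to 200. Note that the schema does not forbid an empty `gene_symbols` array, so neither does this check.

```python
def validate_input(payload):
    """Return a list of problems; an empty list means these checks passed."""
    allowed = {"gene_symbols", "cancer_type", "data_type", "limit"}
    problems = []

    # additionalProperties: false
    extra = set(payload) - allowed
    if extra:
        problems.append(f"unexpected fields: {sorted(extra)}")

    # gene_symbols: required array of strings
    if "gene_symbols" not in payload:
        problems.append("gene_symbols is required")
    else:
        genes = payload["gene_symbols"]
        if not isinstance(genes, list) or not all(isinstance(g, str) for g in genes):
            problems.append("gene_symbols must be an array of strings")

    # cancer_type: string (maxLength 64) or null
    cancer_type = payload.get("cancer_type")
    if cancer_type is not None and (
        not isinstance(cancer_type, str) or len(cancer_type) > 64
    ):
        problems.append("cancer_type must be null or a string of at most 64 characters")

    # data_type: string
    if "data_type" in payload and not isinstance(payload["data_type"], str):
        problems.append("data_type must be a string")

    # limit: integer in [1, 200]; bool is excluded because it subclasses int
    limit = payload.get("limit", 20)
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 200:
        problems.append("limit must be an integer between 1 and 200")

    return problems
```

For example, `validate_input({"gene_symbols": ["TP53"], "limit": 500})` reports only the out-of-range limit.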
Output schema
{
  "$defs": {
    "TCGAMutationResult": {
      "description": "One gene × cancer-type somatic mutation summary from TCGA/GDC.",
      "properties": {
        "gene_symbol": {
          "description": "HGNC gene symbol.",
          "title": "Gene Symbol",
          "type": "string"
        },
        "cancer_type": {
          "description": "TCGA project code (e.g. ``TCGA-LUAD``).",
          "title": "Cancer Type",
          "type": "string"
        },
        "mutation_count": {
          "default": 0,
          "description": "Total somatic mutations (SSMs) in this gene for this project.",
          "title": "Mutation Count",
          "type": "integer"
        },
        "case_count": {
          "default": 0,
          "description": "Unique cases (patients) with at least one SSM in this gene.",
          "title": "Case Count",
          "type": "integer"
        },
        "mutation_frequency": {
          "default": 0,
          "description": "Average SSMs per affected case: mutation_count / case_count. Combine with total_cases in ForgeTCGAOut for population-level frequency.",
          "title": "Mutation Frequency",
          "type": "number"
        },
        "top_mutations": {
          "description": "Up to 5 most common amino acid changes (e.g. ``['p.R175H', 'p.R248W']``).",
          "items": {
            "type": "string"
          },
          "title": "Top Mutations",
          "type": "array"
        },
        "significance": {
          "anyOf": [
            {
              "type": "string"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "description": "Most prevalent mutation consequence type (e.g. ``'missense_variant'``).",
          "title": "Significance"
        }
      },
      "required": [
        "gene_symbol",
        "cancer_type"
      ],
      "title": "TCGAMutationResult",
      "type": "object"
    }
  },
  "description": "Response shape for ``scidex.forge.tcga_genomics``.",
  "properties": {
    "results": {
      "description": "Per-gene mutation summaries, one per (gene × cancer_type) pair.",
      "items": {
        "$ref": "#/$defs/TCGAMutationResult"
      },
      "title": "Results",
      "type": "array"
    },
    "total_cases": {
      "default": 0,
      "description": "Total cases in the queried TCGA project. 0 when cancer_type is None or project info is unavailable.",
      "title": "Total Cases",
      "type": "integer"
    },
    "project_name": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "description": "Human-readable TCGA project name (e.g. ``'Lung Adenocarcinoma'``).",
      "title": "Project Name"
    }
  },
  "title": "ForgeTCGAOut",
  "type": "object"
}
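
The `mutation_frequency` field is an average per affected case, not a share of the cohort. As its description suggests, a population-level frequency can be derived by dividing `case_count` by `total_cases`. The sketch below assumes a response shaped like the output schema; the function name `population_frequency` is hypothetical, and the sample values are made up for illustration and are not real TCGA statistics.

```python
def population_frequency(response):
    """Map (gene_symbol, cancer_type) to case_count / total_cases.

    The value is None when total_cases is 0, which the schema says happens
    when cancer_type is None or project info is unavailable.
    """
    total = response.get("total_cases", 0)
    frequencies = {}
    for result in response.get("results", []):
        key = (result["gene_symbol"], result["cancer_type"])
        frequencies[key] = result.get("case_count", 0) / total if total else None
    return frequencies


# Made-up numbers for illustration only; not real TCGA statistics.
sample = {
    "results": [
        {
            "gene_symbol": "GENE_A",
            "cancer_type": "TCGA-LUAD",
            "mutation_count": 60,
            "case_count": 50,
            "mutation_frequency": 1.2,  # 60 / 50 mutations per affected case
        }
    ],
    "total_cases": 200,
    "project_name": "Lung Adenocarcinoma",
}

freqs = population_frequency(sample)
# freqs[("GENE_A", "TCGA-LUAD")] is 50 / 200 = 0.25
```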

curl snippet

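Replace $SCIDEX_JWT with a valid bearer token and $SCIDEX_HOST with the base URL of your deployment (the variable name is a placeholder; this page gives only the path). Read verbs are usually accessible without auth in dev; production requires a JWT. The authorization header uses double quotes so that the shell expands $SCIDEX_JWT.

curl -sS -X POST "$SCIDEX_HOST/api/scidex/forge/tcga_genomics" \
  -H "authorization: Bearer $SCIDEX_JWT" \
  -H 'content-type: application/json' \
  -d '{
  "gene_symbols": ["TP53"]
}'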
