ncbi_protein read

scidex.forge.ncbi_protein

Look up protein sequences and annotations in NCBI Protein (RefSeq). The tool accepts either a RefSeq accession (NP_, XP_, or YP_ prefix) for an exact lookup, or a gene/protein name with an optional organism filter for a search. Each result carries the accession, GI, protein description, organism, sequence length, and cross-database references; the amino acid sequence is populated for accession queries and may be empty for name searches. Coverage includes RefSeq isoforms and bacterial/viral proteins absent from UniProt. Upstream API: NCBI E-utilities (eutils.ncbi.nlm.nih.gov), which is free and requires no authentication. Anonymous access is limited to 3 requests per second; setting NCBI_API_KEY raises the limit to 10 requests per second.
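
The accession-versus-name routing described above can be illustrated with a minimal Python sketch. It assumes an accession is recognized by its NP_/XP_/YP_ prefix and that a name search is combined with the organism via AND. The regular expression, the function name build_esearch_term, and the form of the name-search term are illustrative assumptions, not the tool's actual implementation; only the [accn] and [orgn] qualifiers come from the schema text.

```python
import re
from typing import Optional

# Assumed pattern: NP_, XP_, or YP_ prefix, digits, optional ".version" suffix.
_REFSEQ_ACCESSION = re.compile(r"^(NP|XP|YP)_\d+(\.\d+)?$")


def build_esearch_term(query: str, organism: Optional[str] = None) -> str:
    """Build a hypothetical E-utilities search term for a query string."""
    q = query.strip()
    if _REFSEQ_ACCESSION.match(q):
        # Exact lookup; the organism filter is ignored for accession queries.
        return f"{q}[accn]"
    if organism:
        # Name search narrowed by organism (assumed AND combination).
        return f"{q} AND {organism.strip()}[orgn]"
    return q
```

With this sketch, an accession such as NP_000537 maps to an exact-match term, while a name such as TP53 with organism Homo sapiens maps to a filtered search term.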

HTTP: POST /api/scidex/forge/ncbi_protein

Invoke

Calls scidex.tool.invoke on the substrate with this tool name. The request body is a JSON object that must match the input schema. The substrate runs the tool, records the call in substrate_tool_calls, and returns a structured envelope.

Schemas

Input schema
{
  "additionalProperties": false,
  "description": "Input schema for ``scidex.forge.ncbi_protein``.",
  "properties": {
    "query": {
      "description": "RefSeq accession (e.g. ``'NP_000537'``, ``'XP_011518068'``) or gene/protein name (e.g. ``'TP53'``, ``'p53'``). Accessions use the ``[accn]`` qualifier for exact lookup; names use a general protein-name search with optional organism filter.",
      "title": "Query",
      "type": "string"
    },
    "organism": {
      "anyOf": [
        {
          "maxLength": 128,
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "description": "Optional organism filter (e.g. ``'Homo sapiens'``, ``'Mus musculus'``). Passed as ``[orgn]`` qualifier in the NCBI search. Ignored for direct accession queries.",
      "title": "Organism"
    },
    "limit": {
      "default": 5,
      "description": "Maximum number of protein records to return. Default 5.",
      "maximum": 50,
      "minimum": 1,
      "title": "Limit",
      "type": "integer"
    }
  },
  "required": [
    "query"
  ],
  "title": "ForgeNCBIProteinIn",
  "type": "object"
}
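
A client may want to check a payload against these constraints before sending it. The following is a minimal Python sketch using only the standard library; the function name validate_input is hypothetical, and the server's own validation may differ in its details and error messages.

```python
from typing import Any, Dict

_ALLOWED_KEYS = {"query", "organism", "limit"}


def validate_input(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Check a payload against the input schema constraints and apply defaults."""
    extra = set(payload) - _ALLOWED_KEYS
    if extra:
        # additionalProperties is false, so unknown keys are rejected.
        raise ValueError(f"unexpected keys: {sorted(extra)}")

    query = payload.get("query")
    if not isinstance(query, str):
        raise ValueError("query is required and must be a string")

    organism = payload.get("organism")
    if organism is not None:
        if not isinstance(organism, str) or len(organism) > 128:
            raise ValueError("organism must be null or a string of at most 128 characters")

    limit = payload.get("limit", 5)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 50:
        raise ValueError("limit must be an integer between 1 and 50")

    return {"query": query, "organism": organism, "limit": limit}
```

Calling validate_input({"query": "TP53"}) fills in the schema defaults (organism null, limit 5), while a limit of 51 or an unknown key raises ValueError.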
Output schema
{
  "$defs": {
    "NCBIProteinEntry": {
      "description": "One NCBI Protein record.",
      "properties": {
        "accession": {
          "description": "Primary RefSeq accession (e.g. ``NP_000537``).",
          "title": "Accession",
          "type": "string"
        },
        "gi": {
          "default": 0,
          "description": "GenInfo Identifier (numeric). 0 when absent from the record.",
          "title": "Gi",
          "type": "integer"
        },
        "name": {
          "default": "",
          "description": "Protein description from GBSeq_definition.",
          "title": "Name",
          "type": "string"
        },
        "organism": {
          "default": "",
          "description": "Organism name from GBSeq_organism.",
          "title": "Organism",
          "type": "string"
        },
        "length": {
          "default": 0,
          "description": "Sequence length in amino acids.",
          "title": "Length",
          "type": "integer"
        },
        "sequence": {
          "default": "",
          "description": "Amino acid sequence. Populated for direct accession queries; may be empty for bulk gene-name results.",
          "title": "Sequence",
          "type": "string"
        },
        "db_xrefs": {
          "description": "Cross-database references from GBQualifier db_xref entries (e.g. ``'GeneID:7157'``, ``'HGNC:HGNC:11998'``, ``'taxon:9606'``).",
          "items": {
            "type": "string"
          },
          "title": "Db Xrefs",
          "type": "array"
        }
      },
      "required": [
        "accession"
      ],
      "title": "NCBIProteinEntry",
      "type": "object"
    }
  },
  "description": "Response shape for ``scidex.forge.ncbi_protein``.",
  "properties": {
    "results": {
      "description": "NCBI Protein records, one per entry found.",
      "items": {
        "$ref": "#/$defs/NCBIProteinEntry"
      },
      "title": "Results",
      "type": "array"
    },
    "total_found": {
      "default": 0,
      "description": "Total hits returned by esearch before the limit was applied.",
      "title": "Total Found",
      "type": "integer"
    },
    "not_found": {
      "default": false,
      "description": "True when esearch returned zero hits for the query.",
      "title": "Not Found",
      "type": "boolean"
    },
    "took_ms": {
      "description": "Wall-clock time for the upstream calls.",
      "title": "Took Ms",
      "type": "integer"
    }
  },
  "required": [
    "took_ms"
  ],
  "title": "ForgeNCBIProteinOut",
  "type": "object"
}
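
To show how the defaults in the output schema behave, here is a minimal Python sketch that loads a response into dataclasses. It assumes the dict passed in is the tool output itself, already unwrapped from any substrate envelope; the class and function names are hypothetical, and the empty-list default for db_xrefs is an assumption since the schema states no default for that field.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List


@dataclass
class ProteinEntry:
    accession: str  # the only required field on an entry
    gi: int = 0
    name: str = ""
    organism: str = ""
    length: int = 0
    sequence: str = ""
    db_xrefs: List[str] = field(default_factory=list)  # assumed default


@dataclass
class ProteinResponse:
    took_ms: int  # the only required field on the response
    results: List[ProteinEntry] = field(default_factory=list)
    total_found: int = 0
    not_found: bool = False


_ENTRY_FIELDS = {f.name for f in fields(ProteinEntry)}


def parse_response(body: Dict[str, Any]) -> ProteinResponse:
    """Build typed objects from a response dict, ignoring unknown entry keys."""
    entries = [
        ProteinEntry(**{k: v for k, v in item.items() if k in _ENTRY_FIELDS})
        for item in body.get("results", [])
    ]
    return ProteinResponse(
        took_ms=body["took_ms"],
        results=entries,
        total_found=body.get("total_found", 0),
        not_found=body.get("not_found", False),
    )


def xref_map(entry: ProteinEntry) -> Dict[str, str]:
    """Split each 'db:id' string on its first colon only."""
    return dict(x.split(":", 1) for x in entry.db_xrefs if ":" in x)
```

Splitting on the first colon only matters for references such as HGNC:HGNC:11998, where the identifier itself contains a colon.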

curl snippet

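Replace $SCIDEX_JWT with a valid bearer token, or export it as a shell variable; the header is double-quoted so the shell can expand it. The path alone is not a complete URL, so $SCIDEX_BASE_URL stands in for the scheme and host of your deployment (a placeholder name, not a documented variable). Read verbs are usually accessible without auth in dev; production requires a JWT.

curl -sS -X POST "$SCIDEX_BASE_URL/api/scidex/forge/ncbi_protein" \
  -H "authorization: Bearer $SCIDEX_JWT" \
  -H 'content-type: application/json' \
  -d '{
  "query": "NP_000537"
}'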
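
The same request can be sketched in Python with the standard library. The base URL is an assumption, since the snippet above shows only the path; build_request and invoke are hypothetical helper names, and the shape of the returned JSON envelope is not specified here, so it is returned as a plain dict.

```python
import json
import urllib.request
from typing import Any, Dict

TOOL_PATH = "/api/scidex/forge/ncbi_protein"


def build_request(base_url: str, token: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Assemble the POST request without sending it."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + TOOL_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "authorization": f"Bearer {token}",
            "content-type": "application/json",
        },
        method="POST",
    )


def invoke(base_url: str, token: str, payload: Dict[str, Any], timeout: float = 30.0) -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    req = build_request(base_url, token, payload)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Separating build_request from invoke keeps the request construction testable without network access.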
