A Guide to the Lake API:
Querying, Filtering, and Exploring Scientific Knowledge Graphs
Scientific Knowledge Graphs represent research information as a network: research products are linked to people, organisations, venues, topics, funding, and domain-specific concepts. This post introduces the SciLake Lake API, which provides a single GraphQL endpoint for querying these links efficiently and at scale.
What the Lake API provides
The Lake API is a GraphQL web service that provides unified access to SciLake’s Scientific Knowledge Graphs. It is designed to support structured exploration of research information represented as a network of connected entities (e.g., research products, people/organisations, venues, topics, funding, and domain-specific concepts).
The Lake API offers the following core capabilities:
- Unified access across graphs and domains: a single schema exposes both core scholarly entities and pilot-specific extensions, reducing the need to learn different interfaces for different domains.
- Selective field retrieval (GraphQL): each request can specify exactly which fields to return, improving clarity and reducing payload size.
- Advanced filtering: queries can combine exact or partial matches, list-based conditions, and nested logical expressions (AND/OR), including constraints that span multiple entity types.
- Pagination and sorting: large result sets can be browsed incrementally and consistently, which is essential for reliable analysis workflows.
- Relationship traversal: a single query can follow links between entities (for example, from a research product to associated technologies, authors, venues, or topics), providing richer context without multiple round trips.
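To make selective field retrieval and pagination concrete, here is a minimal Python sketch. It is not taken from the Lake API documentation: the root field `researchProducts`, the fields `title` and `citationCount`, and the `limit`/`offset` arguments are assumed names used only for illustration, and the real schema may differ. The `fetch` callable stands in for whatever HTTP client sends the query.

```python
from typing import Callable, Iterator, List

# Hypothetical query: `researchProducts`, `title`, `citationCount`,
# `limit`, and `offset` are illustrative names, not the confirmed schema.
PRODUCTS_QUERY = """
query Products($limit: Int!, $offset: Int!) {
  researchProducts(limit: $limit, offset: $offset) {
    title
    citationCount
  }
}
"""


def iter_pages(
    fetch: Callable[[str, dict], dict], page_size: int = 50
) -> Iterator[List[dict]]:
    """Yield successive pages until an empty or short page is returned.

    `fetch(query, variables)` is expected to return the decoded `data`
    object of a GraphQL response.
    """
    offset = 0
    while True:
        data = fetch(PRODUCTS_QUERY, {"limit": page_size, "offset": offset})
        page = data.get("researchProducts", [])
        if not page:
            return  # nothing left to read
        yield page
        if len(page) < page_size:
            return  # a short page means this was the last one
        offset += page_size
```

Only the two fields named in the selection set are requested, so the response stays small, and the generator lets an analysis script process one page at a time instead of loading the full result set.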
How queries are executed
When a query is submitted:
- The GraphQL layer validates the operation against the unified schema.
- A request header selects the target graph (multi-graph support without separate endpoints).
- Resolver logic translates the GraphQL request into one or more Cypher queries.
- The graph backend executes the query; results are aggregated and returned as structured JSON.
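From the client side, the steps above amount to a single POST request carrying the query and a graph-selecting header. The sketch below uses only the Python standard library. The endpoint URL is the one given in this post; the header name `X-Graph` is a placeholder assumption, so the actual name should be taken from the documentation.

```python
import json
import urllib.request

ENDPOINT = "https://scilake-api.athenarc.gr/graphql"


def build_request(query: str, variables: dict, graph: str) -> urllib.request.Request:
    """Build (but do not send) a GraphQL POST request for one target graph."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: "X-Graph" is a placeholder, not the documented header name.
            "X-Graph": graph,
        },
        method="POST",
    )


def run(request: urllib.request.Request) -> dict:
    """Send the request and return the `data` part of the JSON response."""
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # GraphQL reports failures in an `errors` list, often with HTTP 200.
    if payload.get("errors"):
        raise RuntimeError(payload["errors"])
    return payload["data"]
```

Keeping request construction separate from sending makes it easy to inspect or log exactly what will be submitted before any network call is made.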
Figure: Lake API high-level architecture
A key strength of the Lake API is the ability to combine structured filtering with relationship traversal. For example, a single query can retrieve research products that are:
- linked to a specific technology,
- at or above a citation-count threshold, and
- returned with only the fields relevant to the analysis.
This supports efficient discovery workflows where users iteratively refine filters and expand traversal to related entities (agents, technologies, venues, topics, and domain-specific concepts).
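The technology-plus-citations example could be expressed roughly as follows. This is a sketch under stated assumptions: the filter input type `ResearchProductFilter`, the operators `eq`, `contains`, and `gte`, the `AND`/`OR` keys, the `technologies` relationship, and the sample values are all hypothetical and chosen only to show how nested conditions and traversal fit together in one request.

```python
import json


def all_of(*conditions: dict) -> dict:
    """Combine conditions so that every one must hold (logical AND)."""
    return {"AND": list(conditions)}


def any_of(*conditions: dict) -> dict:
    """Combine conditions so that at least one must hold (logical OR)."""
    return {"OR": list(conditions)}


# Hypothetical query: type, field, and argument names are illustrative only.
QUERY = """
query ProductsByTechnology($filter: ResearchProductFilter) {
  researchProducts(filter: $filter) {
    title
    citationCount
    technologies { name }
  }
}
"""

# Products linked to a technology (exact or partial name match)
# AND cited at least 10 times. The values are made-up examples.
product_filter = all_of(
    any_of(
        {"technologies": {"name": {"eq": "photovoltaics"}}},
        {"technologies": {"name": {"contains": "solar"}}},
    ),
    {"citationCount": {"gte": 10}},
)

payload = {"query": QUERY, "variables": {"filter": product_filter}}
print(json.dumps(payload["variables"], indent=2))
```

Passing the filter as a variable rather than splicing it into the query text keeps the query string fixed while the conditions are refined iteratively.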
Where to explore the API
The documentation provides an overview of available graphs, entity types, and example queries, alongside an interactive GraphiQL environment for schema introspection and query prototyping:
- Documentation: https://scilake-api.athenarc.gr/
- GraphQL endpoint: https://scilake-api.athenarc.gr/graphql
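Schema introspection can also be scripted. The query below uses the standard `__schema` introspection field defined by the GraphQL specification, so it does not depend on any Lake API specific names; whether the endpoint allows introspection from scripts, and which types it returns, is an assumption to verify. The helper simply lists the object types found in an introspection result.

```python
from typing import List

# Standard GraphQL introspection query: lists every type with its kind.
INTROSPECTION_QUERY = """
{
  __schema {
    types {
      name
      kind
    }
  }
}
"""


def object_type_names(data: dict) -> List[str]:
    """Return the sorted names of object types from an introspection result.

    `data` is the `data` object of the GraphQL response. Types whose names
    start with "__" belong to the introspection system itself and are skipped.
    """
    types = data["__schema"]["types"]
    return sorted(
        t["name"]
        for t in types
        if t["kind"] == "OBJECT" and not t["name"].startswith("__")
    )
```

The resulting list is a quick way to see which entity types a graph exposes before writing more specific queries in GraphiQL.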