Scientific Lake service bundle
The service bundle is designed to collect, manage, and query heterogeneous scholarly content. It provides functionalities, components and open APIs to support research activities.
The bundle has the following components:
Enhancing metadata through text and data mining
Information Inference Service
Information Inference Service (IIS) is a flexible big data processing system built on Apache Hadoop technologies. A subsystem of OpenAIRE, it applies text and data mining algorithms to full texts, extracting new entities and relations that enrich Scientific Knowledge Graphs (SKGs).
Unlocking knowledge through PDF acquisition
PDFfetcher
PDFfetcher is a tool that acquires the full text of publications by collecting PDFs from URL links. It covers over 60 million PDF articles, giving researchers a comprehensive full-text resource.
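As a rough illustration of collecting a PDF from a URL link, the minimal Python sketch below downloads a document and keeps it only if the payload starts with the PDF header bytes. It is not PDFfetcher's actual implementation: the function names `looks_like_pdf` and `fetch_pdf`, the user agent string, and the timeout are all hypothetical choices made for this example.

```python
import urllib.request
from typing import Optional

# Every PDF file begins with this header, followed by a version such as "1.7".
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the payload starts with the PDF header bytes."""
    return data.startswith(PDF_MAGIC)


def fetch_pdf(url: str, timeout: float = 30.0) -> Optional[bytes]:
    """Download `url` and return its bytes, or None if it is not a PDF.

    Hypothetical helper: many publisher links return an HTML landing page
    instead of the document, so the payload is checked before it is kept.
    """
    request = urllib.request.Request(
        url, headers={"User-Agent": "pdf-fetch-sketch/0.1"}  # assumed value
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        data = response.read()
    return data if looks_like_pdf(data) else None
```

Checking the header rather than trusting the URL suffix or the declared content type is a deliberate choice in this sketch, since links ending in `.pdf` can still resolve to a login or landing page.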
Domain-Specific Machine Translation
Machine Translation System
The Machine Translation system ensures accurate and contextually appropriate translations by fine-tuning general-purpose machine translation models with domain-specific scientific data.
Data Science Tool for Heterogeneous Network Mining
SciNeM
SciNeM is a data science tool for metapath-based querying and analysis of Heterogeneous Information Networks. It enables entity ranking, similarity searches, and community detection.
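To make the idea of a metapath concrete, the following Python sketch counts instances of the Author-Paper-Author metapath in a toy network and ranks co-authors by that count. It illustrates the general technique only and does not use SciNeM's API: the node names, the edge list, and the helper `metapath_counts` are invented for this example.

```python
from collections import defaultdict

# Toy heterogeneous network: every node has a type (all data is made up).
node_type = {
    "alice": "Author", "bob": "Author", "carol": "Author",
    "p1": "Paper", "p2": "Paper", "p3": "Paper",
}
# Authorship edges between authors and papers.
edges = [
    ("alice", "p1"), ("bob", "p1"),
    ("bob", "p2"), ("carol", "p2"),
    ("alice", "p3"), ("bob", "p3"),
]

# Build an undirected adjacency map.
adjacency = defaultdict(set)
for u, v in edges:
    adjacency[u].add(v)
    adjacency[v].add(u)


def metapath_counts(start, metapath):
    """Count metapath instances from `start` to each reachable end node.

    `metapath` is a list of node types, e.g. ["Author", "Paper", "Author"].
    """
    if node_type[start] != metapath[0]:
        raise ValueError("start node does not match the first metapath type")
    frontier = {start: 1}  # node -> number of partial paths ending there
    for wanted_type in metapath[1:]:
        next_frontier = defaultdict(int)
        for node, count in frontier.items():
            for neighbour in adjacency[node]:
                if node_type[neighbour] == wanted_type:
                    next_frontier[neighbour] += count
        frontier = next_frontier
    return dict(frontier)


# Rank bob's co-authors by the number of shared papers (APA instances).
counts = metapath_counts("bob", ["Author", "Paper", "Author"])
ranked = sorted(
    ((author, n) for author, n in counts.items() if author != "bob"),
    key=lambda item: (-item[1], item[0]),
)
print(ranked)  # [('alice', 2), ('carol', 1)]
```

The same path counts can serve as the basis for similarity measures between entities; here they are used directly as a simple ranking score.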
Simplifying access to knowledge
Lake API
The Lake API is an open API initiative that aims to hide technical complexity and provide a user-friendly interface for accessing information.
Discovering Dependencies, Enriching Knowledge
KG creation assistant & Interlinking
The Knowledge Graph creation assistant & Interlinking tool is designed to extract knowledge graphs from unstructured or semi-structured data sources and enrich their content.
Enriching research through comprehensive resource description
Data Catalogue
The SciLake Catalogue is a central registry for discovering Scientific Knowledge Graphs (SKGs) and tools across the SciLake ecosystem.
High-performance graph analytics
AvantGraph
AvantGraph is a high-performance graph processing engine for scientific data lakes. Services built on top of it can perform graph analytics across a wide range of data processing tasks.