Data Modelling and Import
RDF Insert Methods in Virtuoso
Loading RDF Datasets into a Virtuoso Graph IRI
Virtuoso supports various methods for inserting RDF data into its triple store. Below are the steps for loading RDF datasets, applicable to both TTL and RDF/XML formats. Specific differences in commands for each format are highlighted separately.
Method 1: Using Command Prompt
Common Steps for Loading RDF Data
- Prepare the RDF Data File
  Ensure your RDF data is in the correct file format:
  - Turtle (TTL)
  - RDF/XML
- Start Virtuoso
  Ensure that your Virtuoso server is running.
- Connect to Virtuoso Using isql
  Open a terminal and connect to your Virtuoso instance:
- Loading RDF Data
  Loading TTL Data
  To load a Turtle (TTL) file, use the DB.DBA.TTLP_MT function:
  Loading RDF/XML Data
  To load an RDF/XML file, use the DB.DBA.RDF_LOAD_RDFXML_MT function:
  Guide to Parameters:
  - file_to_string_output('/path/to/your/file.ttl' or '/path/to/your/file.rdf'): Path to your TTL or RDF/XML file.
  - '': Base IRI (leave as an empty string if not needed).
  - 'http://example.com/graph': The target graph IRI where the triples will be stored.
  - 0 or 2: Parsing/logging mode (0 for TTL, 2 for RDF/XML to log everything).
  - 4: Number of threads to use for loading.
  - 1: Transactional mode (0 for non-transactional, 1 for transactional).
- Verifying Data Load
  After loading the data, you can verify the import by running a SPARQL query:
- Committing the Transaction
  If you are running in transactional mode, ensure the data is saved by committing the transaction:
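If you script these steps, the statements for the load, verify, and commit steps can be generated rather than typed by hand. The following is a minimal Python sketch, not an official client. The helper names are hypothetical. It assumes that DB.DBA.TTLP_MT takes a parser-flags argument (0) ahead of the log mode, so the TTL call carries one more argument than the RDF/XML call; check the function reference for your Virtuoso version before relying on that order.

```python
import re
from pathlib import PurePosixPath


def sql_quote(value: str) -> str:
    """Render a string as a single-quoted SQL literal, doubling embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_load_statement(path: str, graph: str, threads: int = 4,
                         transactional: int = 1, base: str = "") -> str:
    """Choose the loader function from the file extension and build one call."""
    head = (f"file_to_string_output({sql_quote(path)}), "
            f"{sql_quote(base)}, {sql_quote(graph)}")
    suffix = PurePosixPath(path).suffix.lower()
    if suffix == ".ttl":
        # Assumed argument order: flags, log_mode, threads, transactional.
        return f"DB.DBA.TTLP_MT({head}, 0, 2, {threads}, {transactional});"
    if suffix == ".rdf":
        # Assumed argument order: log_mode, threads, transactional.
        return f"DB.DBA.RDF_LOAD_RDFXML_MT({head}, 2, {threads}, {transactional});"
    raise ValueError(f"unsupported RDF file extension: {suffix!r}")


def build_count_statement(graph: str) -> str:
    """Build a SPARQL statement that counts the triples in one named graph."""
    # Reject characters that are not allowed inside an IRI reference.
    if re.search(r'[<>"{}|\\^`\s]', graph):
        raise ValueError(f"not a usable graph IRI: {graph!r}")
    return (f"SPARQL SELECT (COUNT(*) AS ?triples) "
            f"FROM <{graph}> WHERE {{ ?s ?p ?o }};")


# Only needed when the load ran in transactional mode.
COMMIT_STATEMENT = "COMMIT WORK;"
```

Each returned string is intended to be pasted into an isql session or passed to it by a wrapper script. A count of zero after loading usually points to a wrong graph IRI or an unreadable file path.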
Method 2: Using Virtuoso Conductor (User Interface)
Step-by-Step Guide to Upload RDF Data Using Quad Store Upload
- Prepare Your RDF File
  Make sure your RDF data is in a proper Turtle (.ttl) or RDF/XML (.rdf) format.
- Open Virtuoso Conductor Interface
  Open your web browser and navigate to the Virtuoso Conductor interface at http://localhost:8890/conductor.
- Navigate to Linked Data > Quad Store Upload
  In the Conductor interface, click on Linked Data in the left-hand menu. Select Quad Store Upload to open the RDF data upload form.
- Upload RDF File
  Click on the Choose file button and select your RDF file from your local machine.
- Set Target Graph IRI
  Specify a meaningful Named Graph IRI where the data will be stored, e.g., http://example.com/NFDI_Ontology.
- Start Import
  Click on the Upload button to start the import process. Virtuoso will upload the file and insert the triples into the specified graph.
- Verify Import
  Open Interactive SQL in the Conductor interface and run a SPARQL query to verify that the data has been loaded correctly:
Bulk Loading of RDF Datasets into Virtuoso
This guide outlines the steps for bulk loading large RDF datasets into Virtuoso, which may involve multiple files and loading them into one or several graphs.
Prerequisites
- Virtuoso Bulk Loader Functions: Ensure these functions are available (pre-loaded in Virtuoso 06.02.3129 and later).
- Directory Permissions: The directory containing the RDF files must be listed in the DirsAllowed parameter in the Virtuoso INI file, followed by a server restart.
- System Configuration: Configure memory and resources as per the Virtuoso RDF Performance Tuning Guide.
- Supported File Formats: Files must be in .grdf, .nq, .nt, .owl, .rdf, .trig, .ttl, .xml, or compressed formats (.gz, .bz2, .xz).
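As a pre-flight check, the extension list above can be applied to a directory listing before any files are registered. The sketch below is illustrative Python and not part of Virtuoso. The function name is hypothetical, and it assumes that a compressed file wraps one of the listed RDF extensions (for example data.ttl.gz).

```python
RDF_EXTENSIONS = (".grdf", ".nq", ".nt", ".owl", ".rdf", ".trig", ".ttl", ".xml")
COMPRESSION_EXTENSIONS = (".gz", ".bz2", ".xz")


def is_loadable(filename: str) -> bool:
    """Return True if the file name matches a format the bulk loader lists."""
    name = filename.lower()
    # Strip at most one compression suffix, then test the RDF extension.
    for suffix in COMPRESSION_EXTENSIONS:
        if name.endswith(suffix):
            name = name[: -len(suffix)]
            break
    return name.endswith(RDF_EXTENSIONS)
```

Filtering a listing with this function before registration makes it easier to spot stray files, such as README or checksum files, that would otherwise be left unexplained in the load directory.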
Bulk Loading Process
Specify Graph IRI
Place the graph IRI, e.g., http://dbpedia.org, in a .graph file in the same directory as the RDF files.
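Creating the .graph file can be scripted. The Python sketch below rests on a naming assumption that the text above does not state: that a directory-wide graph IRI goes in a file called global.graph, and a per-file IRI goes in a file named after the data file with .graph appended. Verify both names against the bulk loader documentation for your version. The helper name is hypothetical.

```python
from pathlib import Path
from typing import Optional


def write_graph_file(directory: str, graph_iri: str,
                     data_file: Optional[str] = None) -> Path:
    """Write the graph IRI next to the RDF files and return the file path."""
    # Assumed naming: "<data file>.graph" for one file, else "global.graph".
    name = f"{data_file}.graph" if data_file else "global.graph"
    target = Path(directory) / name
    target.write_text(graph_iri + "\n", encoding="utf-8")
    return target
```

For example, write_graph_file("/data/rdf", "http://dbpedia.org") would create /data/rdf/global.graph containing the single IRI.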
Register Files for Loading
Use ld_dir() to register the files in a single directory, or ld_dir_all() to also register files in its subdirectories:
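A small Python sketch for generating the registration statement. It assumes the three-argument form (directory, file mask, graph IRI), with the graph IRI acting as the fallback when no .graph file applies. The helper names are hypothetical.

```python
def sql_quote(value: str) -> str:
    """Render a string as a single-quoted SQL literal, doubling embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_register_statement(directory: str, mask: str, graph: str,
                             recursive: bool = False) -> str:
    """Build an ld_dir() call, or ld_dir_all() when subdirectories are included."""
    function = "ld_dir_all" if recursive else "ld_dir"
    args = ", ".join(sql_quote(arg) for arg in (directory, mask, graph))
    return f"{function}({args});"
```

Calling build_register_statement("/data/rdf", "*.ttl", "http://dbpedia.org") yields a statement that can be run in isql. The directory must be one of those listed in DirsAllowed.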
Check Registered Files
Use DB.DBA.load_list to check the list of registered files:
Execute Bulk Load
Run the bulk load by calling the rdf_loader_run() function:
Finalize Loading
Run a checkpoint to ensure the data is committed:
Running Multiple Loaders
To shorten load times on multi-core machines, run multiple loader processes in parallel:
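One common way to do this is to start several isql sessions in the background, each calling rdf_loader_run(), then wait for all of them and finish with a checkpoint. The Python sketch below only generates the shell lines. It assumes the default port 1111, the dba/dba credentials, and isql's exec= argument. Adjust all three for your installation. The function name is hypothetical.

```python
from typing import List


def loader_script_lines(processes: int, port: int = 1111,
                        user: str = "dba", password: str = "dba") -> List[str]:
    """Return shell lines that start parallel loaders, wait, then checkpoint."""
    if processes < 1:
        raise ValueError("at least one loader process is required")
    isql = f"isql {port} {user} {password}"
    # Each loader runs in the background so they work through the list together.
    lines = [f'{isql} exec="rdf_loader_run();" &' for _ in range(processes)]
    lines.append("wait")  # block until every background loader has exited
    lines.append(f'{isql} exec="checkpoint;"')
    return lines
```

Printing "\n".join(loader_script_lines(3)) gives a script with three background loaders. How many loaders are useful depends on the number of CPU cores and on disk throughput, so treat the count as a tuning parameter.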
Stopping the Bulk Load Process
All RDF loader threads can be stopped using:
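A hedged sketch of issuing the stop call from a script. It assumes the stop function is named rdf_load_stop(), as in the OpenLink bulk loader documentation, and that isql accepts a statement through an exec= argument with the default port 1111 and dba credentials. The helper name is hypothetical.

```python
from typing import List

# Assumed name of the stop function; verify it for your Virtuoso version.
STOP_STATEMENT = "rdf_load_stop();"


def isql_exec_argv(statement: str, port: int = 1111,
                   user: str = "dba", password: str = "dba") -> List[str]:
    """Build an argument list that runs one statement through isql."""
    return ["isql", str(port), user, password, f"exec={statement}"]
```

The resulting list, for example isql_exec_argv(STOP_STATEMENT), can be handed to subprocess.run() on a machine where isql is on the PATH.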
Checking Bulk Load Status
Check the DB.DBA.load_list table to confirm successful loads:
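When many files are registered, the rows of DB.DBA.load_list can be summarised instead of read one by one. The Python sketch below assumes the table exposes the columns ll_file, ll_state, and ll_error, with ll_state 0 for registered, 1 for in progress, and 2 for finished, and with ll_error set only on failure. These column meanings are assumptions to check against your installation. Fetching the rows (for example over ODBC) is left out.

```python
from typing import Dict, Iterable, List, Optional, Tuple

STATUS_QUERY = "SELECT ll_file, ll_state, ll_error FROM DB.DBA.load_list;"

Row = Tuple[str, int, Optional[str]]


def summarize_load_list(rows: Iterable[Row]) -> Dict[str, List[str]]:
    """Group registered files by load outcome."""
    summary: Dict[str, List[str]] = {
        "pending": [], "loading": [], "loaded": [], "failed": [],
    }
    for ll_file, ll_state, ll_error in rows:
        if ll_error is not None:
            summary["failed"].append(ll_file)   # an error message was recorded
        elif ll_state == 2:
            summary["loaded"].append(ll_file)   # finished without error
        elif ll_state == 1:
            summary["loading"].append(ll_file)  # a loader is working on it
        else:
            summary["pending"].append(ll_file)  # registered, not yet started
    return summary
```

A load can be considered complete when only the "loaded" list is non-empty. Any entry under "failed" should be investigated through its ll_error text.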
Virtuoso CSV File Bulk Loader
The Virtuoso CSV File Bulk Loader enables efficient bulk loading of CSV files into Virtuoso, storing them as tables. This section provides an overview of the key functions, configuration steps, and examples of usage.
CSV Bulk Load Functions
The following functions are used for performing CSV bulk load operations:
- csv_register(path, mask): Registers CSV files matching the mask in the specified directory.
- csv_register_all(path, mask): Registers CSV files recursively from the specified directory.
- csv_loader_run(max_files, log_enable := 2): Executes the bulk loader to load data into the database. You can specify the maximum number of files and set log_enable=2 to minimize locks during the load.
Configuration and Usage
- Directory Configuration
  Ensure all directories containing CSV files to be loaded are included in the DirsAllowed parameter of the active Virtuoso INI file.
- Table Creation (Optional)
  If a CSV file contains no headers indicating column names, create a table manually before importing. The CSV file will be loaded into this table.
- Table Mapping
  If you want to load a CSV file into a specific table, create a .tb file with the same name as the CSV file. This file should contain a single entry with the fully qualified name of the target table.
- CSV Configuration
  If the CSV file structure differs from the default configuration, create a .cfg file with the same name as the CSV file. This file should contain parameters that define the structure of the CSV file, such as delimiter, quote character, and header line.
  - Invisible “tab” and “space” delimiters should be specified by those names, without the quotation marks.
  - Other delimiter characters (comma, period, etc.) should simply be typed in.
  - “Smart” quotation marks which differ at start and end (including but not limited to « », ‹ ›, “ ”, and ‘ ’) are not currently supported.
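The phrase "same name as the CSV file" can be made concrete with a short Python sketch. It assumes that the sidecar name is the CSV file name with any .gz and .csv suffixes removed, which matches the csv-example.csv.gz and csv-example.cfg pair used in this guide. The function name is hypothetical.

```python
from pathlib import PurePosixPath


def sidecar_name(csv_file: str, extension: str) -> str:
    """Derive the .tb or .cfg file name that belongs to a CSV file."""
    name = PurePosixPath(csv_file).name
    # Drop the compression suffix first, then the .csv suffix.
    for suffix in (".gz", ".csv"):
        if name.lower().endswith(suffix):
            name = name[: -len(suffix)]
    return name + extension
```

For csv-example.csv.gz this gives csv-example.tb for the table mapping and csv-example.cfg for the structure settings, both placed in the same directory as the CSV file.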
Example of a CSV Configuration File
Consider loading a gzipped CSV file, csv-example.csv.gz, with the non-default CSV structure below:
In this example:
- the header is on the third line (#2 when counting from zero)
- the data starts from the fifth line (#4 when counting from zero)
- the delimiter is tab
- the quote char is the single-quote, or apostrophe
Loading this file requires the creation of a configuration file, csv-example.cfg, containing the entries:
This configuration tells Virtuoso how to interpret the CSV structure for proper loading.
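The zero-based arithmetic is easy to get wrong, so the Python sketch below derives the entries from ordinary one-based line numbers. The section name [csv] and the keys csv-delimiter, csv-quote, header, and offset are assumptions based on the OpenLink CSV bulk loader documentation. Confirm them for your Virtuoso version. The function names are hypothetical.

```python
SMART_QUOTES = set("«»‹›“”‘’")


def cfg_delimiter(delimiter: str) -> str:
    """Spell out invisible delimiters by name; pass other characters through."""
    return {"\t": "tab", " ": "space"}.get(delimiter, delimiter)


def render_csv_cfg(header_line: int, data_line: int,
                   delimiter: str, quote: str) -> str:
    """Render .cfg text from one-based header and first-data line numbers."""
    if quote in SMART_QUOTES:
        raise ValueError("smart quotation marks are not supported")
    lines = [
        "[csv]",                                     # assumed section name
        f"csv-delimiter={cfg_delimiter(delimiter)}",
        f"csv-quote={quote}",
        f"header={header_line - 1}",                 # zero-based header line
        f"offset={data_line - 1}",                   # zero-based first data line
    ]
    return "\n".join(lines) + "\n"
```

For the example above, render_csv_cfg(3, 5, "\t", "'") produces a header value of 2 and an offset value of 4, with the delimiter written as the word tab.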
Loading CSV Files
- Register the CSV Files
  Register the CSV files to be loaded using the csv_register() function:
- Execute Bulk Load
  Load the CSV files into Virtuoso by executing:
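The two steps can be generated together. This Python sketch builds the registration and load statements from the function signatures listed earlier. It assumes that csv_loader_run() may be called without arguments to load every registered file. The helper names, directory, and mask are hypothetical.

```python
from typing import List, Optional


def sql_quote(value: str) -> str:
    """Render a string as a single-quoted SQL literal, doubling embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_csv_load_statements(directory: str, mask: str,
                              recursive: bool = False,
                              max_files: Optional[int] = None) -> List[str]:
    """Return the register statement followed by the loader statement."""
    register = "csv_register_all" if recursive else "csv_register"
    # With no limit given, the loader is called without arguments.
    run_args = "" if max_files is None else str(max_files)
    return [
        f"{register}({sql_quote(directory)}, {sql_quote(mask)});",
        f"csv_loader_run({run_args});",
    ]
```

For example, build_csv_load_statements("./CSV", "*.csv") returns the two statements to run, in order, in an isql session.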