Name
csv
delta
iceberg
json
lance/table1
parquet
pgprototest
pgprototest_glaredb
prql_integration
sqllogictests
sqllogictests_bigquery
sqllogictests_cassandra
sqllogictests_clickhouse
sqllogictests_datasources_common
sqllogictests_iceberg
sqllogictests_mongodb
sqllogictests_mysql
sqllogictests_native
sqllogictests_object_store
sqllogictests_postgres
sqllogictests_snowflake
sqllogictests_sqlserver
tpch
xlsx
README.md
generate_iceberg.py

Testdata

Data useful for testing.

Iceberg

Iceberg data is generated with the generate_iceberg.py script, which requires pyspark and a Spark runtime jar. Place the jar in this directory; jars are listed in .gitignore, so it will not be checked in by accident.

Then run the script with Python:

$ python generate_iceberg.py

This generates several Iceberg tables in ./iceberg/tables from the Parquet source files in ./iceberg/source_data.
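
Before running the script, it can help to confirm that both prerequisites are in place. The following is a minimal sketch and is not part of generate_iceberg.py: the helper name check_prereqs is hypothetical, and the *.jar pattern is an assumption, since the exact jar file name is not given here.

```python
import importlib.util
from pathlib import Path


def check_prereqs(directory="."):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper: the '*.jar' pattern is an assumption, because the
    exact name of the Spark runtime jar is not specified in this README.
    """
    problems = []
    # pyspark must be importable for generate_iceberg.py to run.
    if importlib.util.find_spec("pyspark") is None:
        problems.append("pyspark is not installed")
    # The runtime jar is expected to sit in the same directory as the script.
    if not list(Path(directory).glob("*.jar")):
        problems.append(f"no .jar file found in {Path(directory).resolve()}")
    return problems


if __name__ == "__main__":
    for problem in check_prereqs():
        print("missing prerequisite:", problem)
```

The check only looks for any jar file; it does not verify that the jar matches the installed Spark version.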

Test data has also been uploaded to GCS and S3 with the following commands:

$ gsutil cp -r iceberg/ gs://glaredb-test/iceberg
$ aws s3 cp --recursive iceberg/ s3://glaredb-test/iceberg

Source data

Source data was generated with SQL queries like the following:

copy (select *
      from parquet_scan('./benchmarks/artifacts/tpch_1/lineitem/part-0.parquet')
      order by random()
      limit 1000)
  to 'testdata/iceberg/source_data/lineitem.parquet';
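
If the same sampling has to be repeated for several files, the query text can be generated instead of edited by hand. This is a sketch under assumptions: sample_query is a hypothetical helper, not something in this repo, and it only builds the SQL string shown above without executing it.

```python
def sample_query(source_path, dest_path, limit=1000):
    """Build a COPY statement that writes `limit` randomly ordered rows.

    Hypothetical helper for illustration. Paths are interpolated directly
    into the SQL text, so paths containing a single quote are rejected.
    """
    for path in (source_path, dest_path):
        if "'" in path:
            raise ValueError(f"path must not contain a single quote: {path!r}")
    return (
        "copy (select *\n"
        f"      from parquet_scan('{source_path}')\n"
        "      order by random()\n"
        f"      limit {int(limit)})\n"
        f"  to '{dest_path}';"
    )


# Reproduces the lineitem query from this section.
print(sample_query(
    "./benchmarks/artifacts/tpch_1/lineitem/part-0.parquet",
    "testdata/iceberg/source_data/lineitem.parquet",
))
```

Because the rows are ordered by random(), each run produces a different sample, so regenerated source files will not match the ones already checked in.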

Local tests

Use the following command to test local Iceberg tables:

$ cargo test --test sqllogictests -- 'sqllogictests_iceberg/local'

PRQL integration tests

The directory prql_integration contains data from the PRQL repo.

Nothing in these test files is specific to PRQL; they are plain CSVs, copied in so that PRQL integration tests can run in this repo using our SLT framework.

Azure storage container

The storage container (bucket) in Azure was bootstrapped via the azcopy utility.

azcopy copy \
  'https://storage.cloud.google.com/glaredb-test' \
  'https://glaredbtest.blob.core.windows.net/glaredb-test?<sas-token>' \
  --recursive=true

Running this command should copy everything from the GCS bucket into the storage container.

Requirements:

The GOOGLE_APPLICATION_CREDENTIALS environment variable must point to a service account file. Application default credentials will not work.
<sas-token> must be replaced with a token that has "Create", "Add", and "Write" permissions. The token can be generated through the Azure dashboard.
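
The two requirements above can be checked before azcopy is invoked. The following is a minimal sketch, assuming a hypothetical helper named azcopy_command; it only assembles the argument list for the command shown earlier and does not validate the token's permissions.

```python
import os


def azcopy_command(sas_token, env=None):
    """Return the azcopy argument list, or raise if a requirement is unmet.

    Hypothetical helper for illustration; the URLs are the ones used in the
    azcopy command in this section.
    """
    env = os.environ if env is None else env
    creds = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    # The variable must name an existing service account file.
    if not creds or not os.path.isfile(creds):
        raise RuntimeError(
            "GOOGLE_APPLICATION_CREDENTIALS must point to a service account file"
        )
    if not sas_token:
        raise ValueError("a SAS token is required")
    return [
        "azcopy", "copy",
        "https://storage.cloud.google.com/glaredb-test",
        f"https://glaredbtest.blob.core.windows.net/glaredb-test?{sas_token}",
        "--recursive=true",
    ]
```

The returned list can be passed to subprocess.run(..., check=True). Passing the arguments as a list avoids shell quoting problems with the characters that typically appear in a SAS token.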

(Untested): The azcopy sync command should let us update the contents of the Azure container when we add test data to the GCS bucket.
