
Big Data Types


A type-safe library to transform case classes into database schemas and to convert implemented types into other types

Documentation

Check the Documentation website to learn more about how to use this library

Available conversions (✓ marks the conversions demonstrated in the examples below):

| From \ To    | Scala Types | BigQuery | Spark | Cassandra | Circe (JSON) |
|--------------|-------------|----------|-------|-----------|--------------|
| Scala        | -           | ✓        | ✓     | ✓         |              |
| BigQuery     |             | -        | ✓     | ✓         |              |
| Spark        |             | ✓        | -     | ✓         |              |
| Cassandra    |             | ✓        | ✓     | -         |              |
| Circe (JSON) |             |          |       |           |              |

Versions for Scala 2.12, 2.13, and 3.x are available in Maven.

Quick Start

The library has different modules that can be imported separately

  • BigQuery
libraryDependencies += "io.github.data-tools" %% "big-data-types-bigquery" % "{version}"
  • Spark
libraryDependencies += "io.github.data-tools" %% "big-data-types-spark" % "{version}"
  • Cassandra
libraryDependencies += "io.github.data-tools" %% "big-data-types-cassandra" % "{version}"
  • Circe (JSON)
libraryDependencies += "io.github.data-tools" %% "big-data-types-circe" % "{version}"
  • Core
    • Provides support for the abstract SqlTypes. It is already included in the other modules, so it is not needed if you are using one of them
libraryDependencies += "io.github.data-tools" %% "big-data-types-core" % "{version}"

To transform one type into another, both corresponding modules must be imported.
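For example, converting Spark schemas into BigQuery schemas requires both module dependencies. A build.sbt sketch using the coordinates above (`{version}` is a placeholder for the release you want):

```scala
// build.sbt: both modules are needed to convert between Spark and BigQuery types
libraryDependencies ++= Seq(
  "io.github.data-tools" %% "big-data-types-spark"    % "{version}",
  "io.github.data-tools" %% "big-data-types-bigquery" % "{version}"
)
```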

How it works

The library internally uses a generic ADT (SqlType) that can store any schema representation, and from there it can be converted into any other. Transformations are done through two type classes.
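The pattern can be illustrated with a minimal, self-contained sketch. This is not the library's actual API (the type-class and constructor names here are invented for illustration): one type class derives the generic SqlType representation from a source type, and a second one renders that representation into a target schema.

```scala
// Simplified stand-in for the library's generic SqlType ADT
sealed trait SqlType
case object SqlInt extends SqlType
case object SqlString extends SqlType
final case class SqlStruct(fields: List[(String, SqlType)]) extends SqlType

// Type class 1: derive the generic representation from a source type
trait ToSqlType[A] { def getType: SqlType }

// Type class 2: render the generic representation into a target schema
trait SqlRenderer[Out] { def render(t: SqlType): Out }

object Example {
  // Hand-written instance for illustration; the real library derives
  // instances for case classes automatically
  implicit val personType: ToSqlType[(String, Int)] =
    new ToSqlType[(String, Int)] {
      def getType: SqlType = SqlStruct(List("name" -> SqlString, "age" -> SqlInt))
    }

  // A toy renderer that produces a DDL-like string as the "target schema"
  implicit val ddlRenderer: SqlRenderer[String] = new SqlRenderer[String] {
    def render(t: SqlType): String = t match {
      case SqlInt             => "INT"
      case SqlString          => "STRING"
      case SqlStruct(fields)  =>
        fields.map { case (name, tpe) => s"$name ${render(tpe)}" }.mkString(", ")
    }
  }

  // Composing the two type classes gives source-to-target conversion
  def schemaOf[A, Out](implicit conv: ToSqlType[A], r: SqlRenderer[Out]): Out =
    r.render(conv.getType)
}
```

Because every source type only needs to reach SqlType, and every target only needs to render from it, adding a new database module makes it interoperable with all existing ones.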

Quick examples

Case Classes to other types

// Spark
val s: StructType = SparkSchemas.schema[MyCaseClass]
// BigQuery
val bq: List[Field] = SqlTypeToBigQuery[MyCaseClass].bigQueryFields // just the schema
BigQueryTable.createTable[MyCaseClass]("myDataset", "myTable") // create a table in a real BigQuery environment
// Cassandra
val c: CreateTable = CassandraTables.table[MyCaseClass]

There are also extension methods that make it easier to transform between types when instances are available.

//from Case Class instance
val foo: MyCaseClass = ???
foo.asBigQuery // List[Field]
foo.asSparkSchema // StructType
foo.asCassandra("TableName", "primaryKey") // CreateTable

Conversion between types works in the same way

// From Spark to others
val foo: StructType = myDataFrame.schema
foo.asBigQuery // List[Field]
foo.asCassandra("TableName", "primaryKey") // CreateTable

//From BigQuery to others
val foo: Schema = ???
foo.asSparkFields // List[StructField]
foo.asSparkSchema // StructType
foo.asCassandra("TableName", "primaryKey") // CreateTable

//From Cassandra to others
val foo: CreateTable = ???
foo.asSparkFields // List[StructField]
foo.asSparkSchema // StructType
foo.asBigQuery // List[Field]
foo.asBigQuery.schema // Schema
