Skip to content

Store for sharing schemas (Thrift, Smile, ...) between HDFS, DataWarehouses, ...

Notifications You must be signed in to change notification settings

pierre/goodwill

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Goodwill, a Thrift types store

Goodwill is a store to share Thrift schemas.

It provides an HTML-based UI and a JSON API to create and document them. Goodwill is fully integrated with the action-core (github.com/pierre/action-core) to browse Thrift-based sequence files on HDFS. Another usecase is to export Thrift data from HDFS to a SQL-based data warehouse: Goodwill is the middle-man that defines field types and does the translation between Thrift types and SQL types.

Usage

To get started, one can use the CSV file store:

mvn -Dgoodwill.store.type=csv -Dgoodwill.store.csv.file.path=examples/file_store.csv jetty:run

This creates a store with a single type Awesomeness. You can browse it at:

https://127.0.0.1:8080/registrar

To access the API, specify the Accept header:

curl -H'Accept: application/json' https://127.0.0.1:8080/registrar/Awesomeness

Note: the CSV file store is currently read-only. If you want to fix it, look at updateType in github.com/pierre/goodwill/blob/master/src/main/java/com/ning/metrics/goodwill/store/CSVFileStore.java.

A more common usecase is to use the MySQL backend. Simply create a goodwill database and create the following table:

CREATE TABLE `thrift_types` (
`event_type` varchar(255) DEFAULT NULL,
`field_id` int(11) DEFAULT NULL,
`field_type` varchar(255) DEFAULT NULL,
`field_name` varchar(255) DEFAULT NULL,
`sql_type` varchar(255) DEFAULT NULL,
`sql_length` int(11) DEFAULT NULL,
`sql_scale` int(11) DEFAULT NULL,
`sql_precision` int(11) DEFAULT NULL,
`description` varchar(255) DEFAULT NULL,
`id` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`),
UNIQUE KEY `unique_fields` (`event_type`,`field_id`)
) ENGINE=InnoDB AUTO_INCREMENT=114 DEFAULT CHARSET=utf8

You can now run Goodwill via

mvn jetty:run

You can configure your sink via Goodwill. A sink is where the data described in Goodwill is stored. For instance, it can JDBC to your data warehouse to create the tables that will hold the events. For now, only Netezza is supported:

mvn -Dgoodwill.sink.type=netezza -Dgoodwill.sink.db.host=netezza.company.com -Dgoodwill.sink.db.name=company \
    -Dgoodwill.sink.db.user=pierre -Dgoodwill.sink.db.password=pierre -Dgoodwill.sink.db.port=5480 jetty:run

You may want to look at the configuration parameter goodwill.sink.db.extra_sql in GoodwillConfig.java to see how to customize it.

Configuration

See github.com/pierre/goodwill/blob/master/src/main/java/com/ning/metrics/goodwill/binder/config/GoodwillConfig.java.

To build

mvn package

will create a war.

License (see LICENSE-2.0.txt file for full license)

Copyright 2010 Ning

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

About

Store for sharing schemas (Thrift, Smile, ...) between HDFS, DataWarehouses, ...

Resources

Stars

Watchers

Forks

Packages

No packages published