Gobblin currently supports the following features:
Different Data Sources
Different Pulling Types
- SNAPSHOT-ONLY: Pull the snapshot of one dataset.
- SNAPSHOT-APPEND: Pull delta changes since the last run and optionally merge them into the snapshot (delta changes include updates to the dataset since the last run).
- APPEND-ONLY: Pull delta changes since the last run and append them to the dataset.
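
The difference between the three pulling types can be illustrated with a small sketch. This is not Gobblin's API: `PullType`, `apply_pull`, the `merge` flag, and the `id` record key are hypothetical names chosen for illustration, and records are assumed to be plain dictionaries held in memory.

```python
from enum import Enum


class PullType(Enum):
    # Hypothetical enum mirroring the pulling types listed above.
    SNAPSHOT_ONLY = "SNAPSHOT-ONLY"
    SNAPSHOT_APPEND = "SNAPSHOT-APPEND"
    APPEND_ONLY = "APPEND-ONLY"


def apply_pull(pull_type, stored, pulled, merge=False):
    """Return the dataset contents after one pull (illustrative only).

    stored: records already held from earlier runs.
    pulled: the full snapshot (SNAPSHOT-ONLY) or the delta since the last run.
    merge:  for SNAPSHOT-APPEND, whether to fold the delta into the snapshot.
    """
    if pull_type is PullType.SNAPSHOT_ONLY:
        # The new snapshot replaces whatever was stored before.
        return list(pulled)

    if pull_type is PullType.APPEND_ONLY:
        # Delta records are appended; existing records are never rewritten.
        return list(stored) + list(pulled)

    if pull_type is PullType.SNAPSHOT_APPEND:
        if not merge:
            # Keep the delta alongside the snapshot without merging.
            return list(stored) + list(pulled)
        # Merge by key (assumed here to be "id"): updates overwrite the
        # stored record, new keys are added at the end.
        merged = {row["id"]: row for row in stored}
        for row in pulled:
            merged[row["id"]] = row
        return list(merged.values())

    raise ValueError(f"unknown pull type: {pull_type}")


stored = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
delta = [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}]
merged = apply_pull(PullType.SNAPSHOT_APPEND, stored, delta, merge=True)
```

In this sketch, the merged result holds one record per key with the updated value for `id` 2, whereas APPEND-ONLY keeps both the old and the new record for that key.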
Different Deployment Types
- Standalone deployment on a single machine
- Cluster deployment on Hadoop 2.3.0
- Merge delta changes into snapshot.