Data Platform Migration Accelerator
Automates the conversion of legacy data pipelines to dbt.
Why we built this
Data platform migrations are tedious exercises that typically go over budget. Global system integrators (GSIs) usually staff these projects with large numbers of novice engineers, leading to customer dissatisfaction, missed deadlines, and decreased trust in the future data platform.
Migrations, however, are largely repetitive work. Building on years of consulting experience in migration projects, we created Kali, our in-house methodology.
How it works
1. Analyze
Get an understanding of your inventory (see the sketch after this list).
- Code composition and job complexity grouping
- Job categorization into Ingest, Transform, and Outbound pipelines
- End-to-end job lineage (Scheduler > ETL Pipelines > Source/Target Tables)
- Downstream lineage and dependencies between consumers and data sources
- Identification of “dead” or “orphan” jobs that can be ignored in the migration
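As an illustration, here is a minimal Python sketch of this kind of inventory analysis. It is not Kali's implementation: the directory layout, regexes, and helper names are assumptions, and real legacy dialects need a proper SQL parser.

```python
import re
from pathlib import Path

# Illustrative regexes for a simple SQL dialect; production analysis
# would use a real SQL parser instead.
TARGET_RE = re.compile(r"INSERT\s+(?:OVERWRITE\s+TABLE|INTO)\s+([\w.]+)", re.IGNORECASE)
SOURCE_RE = re.compile(r"(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)

def analyze_job(path: Path) -> dict:
    """Extract the source and target tables of one legacy SQL job."""
    sql = path.read_text()
    return {
        "job": path.stem,
        "targets": set(TARGET_RE.findall(sql)),
        "sources": set(SOURCE_RE.findall(sql)),
    }

def find_orphans(jobs: list[dict]) -> list[str]:
    """Flag jobs whose target tables no other job reads.

    Outbound jobs feeding external consumers look like orphans here,
    so their targets would need to be whitelisted separately.
    """
    all_sources = set().union(*(job["sources"] for job in jobs))
    return [
        job["job"]
        for job in jobs
        if job["targets"] and not (job["targets"] & all_sources)
    ]

jobs = [analyze_job(p) for p in Path("legacy_jobs").glob("*.sql")]
print(find_orphans(jobs))
```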
2. Replatform
Lift-and-shift your legacy codebase to a contemporary dbt platform (a conversion sketch follows this list).
- Replace legacy design patterns
- Build the dbt project
- Test the migrated pipelines
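To make the lift-and-shift step concrete, here is a hedged sketch for one common pattern: a legacy INSERT OVERWRITE job becomes a dbt model whose body is the SELECT statement, with the overwrite semantics expressed as a table materialization. The regex and function name are illustrative, not Kali internals.

```python
import re
from pathlib import Path

# Matches the simple 'INSERT OVERWRITE TABLE <target> SELECT ...' shape;
# anything more complex needs a full SQL parser.
INSERT_RE = re.compile(
    r"INSERT\s+OVERWRITE\s+TABLE\s+([\w.]+)\s+(SELECT\s.*)",
    re.IGNORECASE | re.DOTALL,
)

def convert_to_dbt_model(legacy_sql: str, models_dir: Path) -> Path:
    """Rewrite one INSERT OVERWRITE job as a dbt model file.

    dbt models are plain SELECT statements: the target table name becomes
    the model name, and the overwrite becomes a 'table' materialization.
    """
    match = INSERT_RE.search(legacy_sql)
    if match is None:
        raise ValueError("job does not match the simple INSERT OVERWRITE pattern")
    target, select_body = match.groups()
    models_dir.mkdir(parents=True, exist_ok=True)
    model_path = models_dir / f"{target.split('.')[-1]}.sql"
    model_path.write_text(
        "{{ config(materialized='table') }}\n\n" + select_body.strip() + "\n"
    )
    return model_path
```

In a real project the converter would also rewrite table references into dbt's `{{ source(...) }}` and `{{ ref(...) }}` calls so that dbt can track lineage across the migrated models.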
3. Refactor
Improve the design and performance of your migrated code. One common refactor is sketched below.
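For example, a replatformed full-refresh model can often be turned into an incremental one so that each run only processes new rows. A minimal sketch as a dbt Python model, assuming a platform with dbt Python model support such as Snowflake or Databricks; the model, table, and column names are hypothetical.

```python
# models/fct_orders.py: a hypothetical refactor of a replatformed
# full-refresh model into an incremental dbt Python model.

def model(dbt, session):
    # Incremental materialization: each run merges only new rows
    # instead of rebuilding the whole table.
    dbt.config(materialized="incremental", unique_key="order_id")

    orders = dbt.ref("stg_orders")  # the replatformed staging model

    if dbt.is_incremental:
        # Only keep rows newer than what the existing table already holds.
        max_loaded = session.sql(
            f"select max(loaded_at) from {dbt.this}"
        ).collect()[0][0]
        orders = orders.filter(orders.loaded_at > max_loaded)

    return orders
```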
Resources
Apache Hive
Migrate pipelines from Apache Hive and other SQL-based tools into dbt projects.
Apache Spark
Migrate Apache Spark pipelines into dbt projects, either as PySpark models or by converting the legacy code to SQL. A sketch of the PySpark path follows.
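As a hedged illustration of the PySpark path (assuming dbt's Python model support on a Spark-backed platform such as Databricks; all names are illustrative), a small legacy Spark job maps onto a dbt Python model like this:

```python
# Legacy PySpark job (before):
#
#   df = spark.read.table("raw.orders")
#   agg = df.groupBy("customer_id").count()
#   agg.write.mode("overwrite").saveAsTable("mart.orders_per_customer")
#
# The same logic as a dbt Python model (models/orders_per_customer.py):

def model(dbt, session):
    dbt.config(materialized="table")

    # dbt resolves the raw table via a declared source and tracks lineage.
    orders = dbt.source("raw", "orders")

    # The transformation carries over unchanged; the explicit write
    # disappears because dbt materializes whatever the function returns.
    return orders.groupBy("customer_id").count()
```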
How to get access
Kali is enterprise-ready. We offer access in line with your requirements for intellectual property, procurement processes, and strategic agenda.
As part of a migration project
Work with Tropos to migrate your legacy data platform to Snowflake, Databricks, or Redshift.
Through the Snowflake Marketplace
Convert your legacy platform straight away, within the perimeter of your existing contracts.
Through the AWS Marketplace
Procure your license via existing contracts and leverage your EDP commitment.