About me

Data Engineer

Suranjit Banik
Toronto, Canada M5A0H6 | 647-616-9494 | suran.cse@gmail.com
LinkedIn

Data Engineer with 3+ years of experience building data-intensive applications and tackling challenging architectural and scalability problems in the media industry. Currently enhancing GroupM’s capabilities with petabyte-scale data pipelines. Adept at constructing new systems to meet complex business needs. Advanced skills in Python, SQL, BigQuery, and Big Data technologies on the cloud.

SKILLS

  • Google Cloud Platform (GCP)
  • Database Design
  • Python (Programming Language)
  • SQL/NoSQL
  • Databricks

Professional Summary

  • Infrastructure Development
  • Training & Development
  • Data Management
  • Supervision & Leadership
  • Problem Resolution
  • Team Building
  • Planning & Organizing
  • Good Work Ethic
  • Critical Thinking

Languages and Technologies

  • Languages: Python, SQL
  • Big Data Processing
  • Data Warehousing
  • Scripting Languages
  • Data Pipeline Design

Tech Stack: Databricks (PySpark on GCP, Big Data), Hive Metastore, Apache Airflow (Google Cloud Composer), GCS, S3, BigQuery, Dataflow, Dataproc, Redshift, Pub/Sub, Cloud Functions, Google Cloud CLI, Docker, GKE, GitHub, Git

Currently building data pipeline infrastructure on Google Cloud Platform.

Databases: MySQL, PostgreSQL, BigQuery, Bigtable, Redshift

Cloud Storage: GCS, S3, Blob Storage

Containerization: Docker

Real-Time Streaming: Pub/Sub, Dataflow, BigQuery

Cluster Management Engine: GKE

Data Warehouse: BigQuery, Redshift, Azure Synapse Analytics

ETL Tools: Airflow, Databricks, Dataflow, Dataproc, Data Factory

Orchestration & Scheduling: Airflow, Databricks
