About me
Data Engineer
Suranjit Banik
Toronto, Canada M5A0H6 | 647-616-9494 | suran.cse@gmail.com
LinkedIn
Data Engineer with 3+ years of experience building data-intensive applications and tackling challenging architectural and scalability problems in the media industry. Currently enhancing GroupM's capabilities with petabyte-scale data pipelines. Adept at constructing new systems to meet complex business needs. Advanced skills in Python, SQL, BigQuery, and Big Data technologies on the cloud.
SKILLS
- Google Cloud Platform (GCP)
- Database Design
- Python (Programming Language)
- SQL/NoSQL
- Databricks
Professional Summary
- Infrastructure Development
- Training & Development
- Data Management
- Supervision & Leadership
- Problem Resolution
- Team Building
- Planning & Organizing
- Good Work Ethic
- Critical Thinking
Languages and Technologies
- Language: Python, SQL
- Big Data Processing
- Data Warehousing
- Scripting Languages
- Data Pipeline Design
Tech Stack: Databricks - PySpark on GCP (Big Data), Hive Metastore, Apache Airflow (Google Cloud Composer), GCS, S3, BigQuery, Dataflow, Dataproc, Redshift, Pub/Sub, Cloud Functions, Google Cloud CLI, Docker, GKE, GitHub, Git
Currently building data pipeline infrastructure on Google Cloud Platform.
Databases: MySQL, PostgreSQL, BigQuery, Bigtable, Redshift
Cloud Storage: GCS, S3, Blob Storage
Containerization: Docker
Realtime Streaming: Pub/Sub, Dataflow, BigQuery
Cluster Management Engine: GKE
Data Warehouse: BigQuery, Redshift, Azure Synapse Analytics
ETL Tools: Airflow, Databricks, Dataflow, Dataproc, Data Factory
Orchestration & Scheduler: Airflow, Databricks