By Data Engineers for Data Engineers!
This is not a site for professional teachers, talkers, paper engineers, or those who are experts in tech but have never built anything. We are on the front lines daily, building data pipelines, streaming apps, and big data analytics. We simply share the code, tutorials, tips, and tricks that actually work in the field, not on paper. Contact us for info, questions, or feedback on articles and, as always, feel free to contribute code or articles.
Joe Trite, Founder, 2298 Software
joe@2298-software.com
https://github.com/2298-Software
iluv2cum.com
Tech Articles
Tube Site Automagically
We chatted with our CEO about new projects and were truly surprised by his answer.
Change Data Capture with Apache Spark
Many teams give up on Change Data Capture (CDC) on Big Data due to its complexity. This tutorial provides a simple pattern to reliably capture changes
How much do you pay for a line of code?
Have you ever been asked why it takes so long to complete an IT project, or to provide an accurate estimate for one?
Upserting S3 Data Using Hudi & PySpark
Apache Hudi is a powerful framework for managing storage of large datasets like those typically found in Amazon S3.
Synchronizing S3 Data with Hudi
How to create a large, multi-tenant, enterprise-grade Hadoop Data Lake
Hadoop to AWS Migration Plan
Learn how to migrate an enterprise Hadoop platform to AWS
Data Engineering Best Practices
A list of best practices taken from 20+ years working in IT.
Dockerize HTTP Service Latency Monitor
An HTTP service monitor that captures performance statistics for a provided list of URLs
Infrastructure as Diagram
Many shops are adopting Infrastructure as Code (IaC), but what if you could skip the code and create your infrastructure from a diagram?
Enterprise Hadoop
How to create a large, multi-tenant, enterprise-grade Hadoop Data Lake
Spark Performance Guide
A guide on common performance problems faced by engineers and how to fix them.
Setting Up a Confluent Kafka Docker Cluster Tutorial
Learn how to create a full Confluent Kafka cluster on your laptop using Docker
PySpark Structured Streaming with Confluent Kafka
Learn how to stream data with a Confluent Kafka cluster using PySpark
Spark on a Windows
Learn how to set up Apache Spark on a Windows computer
Load Forecasting with Prophet
Load forecasting example using Facebook's Prophet framework